Foreword


This book is meant as a text for a first year graduate course in analysis. 
Any standard course in undergraduate analysis will constitute sufficient 
preparation for its understanding, for instance, my Undergraduate Anal- 
ysis. I assume that the reader is acquainted with notions of uniform con- 
vergence and the like. 

In this third edition, I have reorganized the book by covering inte- 
gration before functional analysis. Such a rearrangement fits the way 
courses are taught in all the places I know of. I have added a number of 
examples and exercises, as well as some material about integration on the 
real line (e.g. on Dirac sequence approximation and on Fourier analysis), 
and some material on functional analysis (e.g. the theory of the Gelfand 
transform in Chapter XVI). These upgrade previous exercises to sections 
in the text. 

In a sense, the subject matter covers the same topics as elementary 
calculus, viz. linear algebra, differentiation and integration. This time, 
however, these subjects are treated in a manner suitable for the training 
of professionals, i.e. people who will use the tools in further investiga- 
tions, be it in mathematics, or physics, or what have you. 

In the first part, we begin with point set topology, essential for all 
analysis, and we cover the most important results. 

I am selective here, since this part is regarded as a tool, especially 
Chapters I and II. Many results are easy, and are less essential than
those in the text. They have been given in exercises, which are designed 
to acquire facility in routine techniques and to give flexibility for those 
who want to cover some of them at greater length. The point set topol- 
ogy simply deals with the basic notions of continuity, open and closed 
sets, connectedness, compactness, and continuous functions. The chapter 
concerning continuous functions on compact sets properly emphasizes
results which already mix analysis and uniform convergence with the 
language of point set topology. 

In the second part, Chapters IV and V, we describe briefly the two 
basic linear spaces of analysis, namely Banach spaces and Hilbert spaces. 

The next part deals extensively with integration. 

We begin with the development of the integral. The fashion has been 
to emphasize positivity and ordering properties (increasing and decreas- 
ing sequences). I find this excessive. The treatment given here attempts 
to give a proper balance between $L^1$-convergence and positivity. For
more detailed comments, see the introduction to Part Three and Chapter 
VI. 

The chapters on applications of integration and distributions provide 
concrete examples and choices for leading the course in other directions, 
at the taste of the lecturer. The general theory of integration in mea- 
sured spaces (with respect to a given positive measure) alternates with 
chapters giving specific results of integration on euclidean spaces or the 
real line. Neither is slighted at the expense of the other. In this third 
edition, I have added some material on functions of bounded variation, 
and I have emphasized convolutions and the approximation by Dirac 
sequences or families even more than in the previous editions, for in- 
stance, in Chapter VIII, §2. 

For want of a better place, the calculus (with values in a Banach 
space) now occurs as a separate part after dealing with integration, and 
before the functional analysis. 

The differential calculus is done because at best, most people will
be acquainted with it only in euclidean space, and incompletely at that.
More importantly, the calculus in Banach spaces has acquired consider- 
able importance in the last two decades, because of many applications 
like Morse theory, the calculus of variations, and the Nash-Moser implicit
mapping theorem, which lies even further in this direction since one
has to deal with more general spaces than Banach spaces. These results 
pertain to the geometry of function spaces. Cf. the exercises of Chapter 
XIV for simpler applications. 

The next part deals with functional analysis. The purpose here is 
twofold. We place the linear algebra in an infinite dimensional setting 
where continuity assumptions are made on the linear maps, and we show 
how one can “linearize” a problem by taking derivatives, again in a 
setting where the theory can be applied to function spaces. This part 
includes several major spectral theorems of analysis, showing how we can 
extend to the infinite dimensional case certain results of finite dimen- 
sional linear algebra. The compact and Fredholm operators have appli- 
cations to integral operators and partial differential elliptic operators (e.g. 
in papers of Atiyah-Singer and Atiyah-Bott).

Chapters XIX and XX, on unbounded hermitian operators, combine
both the linear algebra and integration theory in the study of such
operators. One may view the treatment of spectral measures as providing
an example of general integration theory on locally compact spaces, 
whereby a measure is obtained from a functional on the space of contin- 
uous functions with compact support. 

I find it appropriate to introduce students to differentiable manifolds 
during this first year graduate analysis course, not only because these 
objects are of interest to differential geometers or differential topologists, 
but because global analysis on manifolds has come into its own, both in 
its integral and differential aspects. It is therefore desirable to integrate 
manifolds in analysis courses, and I have done this in the last part, which 
may also be viewed as providing a good application of integration theory. 

A number of examples are given in the text but many interesting 
examples are also given in the exercises (for instance, explicit formulas for 
approximations whose existence one knows abstractly by the Weierstrass-Stone
theorem; integral operators of various kinds; etc.). The exercises
should be viewed as an integral part of the book. Note that Chapters 
XIX and XX, giving the spectral measure, can be viewed as providing 
an example for many notions which have been discussed previously: 
operators in Hilbert space, measures, and convolutions. At the same 
time, these results lead directly into the real analysis of the working 
mathematician. 

As usual, I have avoided as far as possible building long chains of 
logical interdependence, and have made chapters as logically independent 
as possible, so that courses which run rapidly through certain chapters, 
omitting some material, can cover later chapters without being logically 
inconvenienced. 

The present book can be used for a two-semester course, omitting 
some material. I hope I have given a suitable overview of the basic tools 
of analysis. There might be some reason to include other topics, such as 
the basic theorems concerning elliptic operators. I have omitted this 
topic and some others, partly because the appendices to my $SL_2(\mathbf{R})$
constitute a sub-book which contains these topics, and partly because
there is no time to cover them in the basic one year course addressed to 
graduate students. 

The present book can also be used as a reference for basic analysis, 
since it offers the reader the opportunity to select various topics without 
reading the entire book. The subject matter is organized so that it makes 
the topics available to as wide an audience as possible. 

There are many very good books in intermediate analysis, and inter- 
esting research papers, which can be read immediately after the present 
course. A partial list is given in the Bibliography. In fact, the determina- 
tion of the material included in this Real and Functional Analysis has 
been greatly motivated by the existence of these papers and books, and 
by the need to provide the necessary background for them. 

Finally, I thank all those people who have made valuable comments
and corrections, especially Keith Conrad, Martin Mohlenkamp, Takesi 
Yamanaka, and Stephen Chiappari, who reviewed the book for Springer- 
Verlag. 


New Haven 1993/1996 SERGE LANG 


PART ONE


General Topology 


CHAPTER I


Sets 


I, §1. SOME BASIC TERMINOLOGY 


We assume that the reader understands the meaning of the word “set”,
and in this chapter, we summarize briefly the basic properties of sets and
operations between sets. We denote the empty set by $\emptyset$. A subset S' of
S is said to be proper if $S' \neq S$. We write $S' \subset S$ or $S \supset S'$ to denote the
fact that S' is a subset of S.

Let S, T be sets. A mapping or map $f: T \to S$ is an association which
to each element $x \in T$ associates an element of S, denoted by f(x), and
called the value of f at x, or the image of x under f. If T' is a subset of
T, we denote by f(T') the subset of S consisting of all elements f(x) for
$x \in T'$. The association of f(x) to x is denoted by the special arrow


$$x \mapsto f(x).$$


We usually reserve the word function for a mapping whose values are in
the real or complex numbers. The characteristic function of a subset S' of
S is the function $\chi$ such that $\chi(x) = 1$ if $x \in S'$ and $\chi(x) = 0$ if $x \notin S'$. We
often write $\chi_{S'}$ for this function.

Let X, Y be sets. A map $f: X \to Y$ is said to be injective if for all x,
$x' \in X$ with $x \neq x'$ we have $f(x) \neq f(x')$. We say that f is surjective if
f(X) = Y, i.e. if the image of f is all of Y. We say that f is bijective if it
is both injective and surjective. As usual, one should index a map f by
its set of arrival and set of departure to have absolutely correct notation,
but this is too clumsy, and the context is supposed to make it clear what
these sets are. For instance, let $\mathbf{R}$ denote the real numbers, and $\mathbf{R}'$ the
real numbers $\geq 0$. The map

$$f: \mathbf{R} \to \mathbf{R}$$

given by $x \mapsto x^2$ is not surjective, but the map

$$f: \mathbf{R} \to \mathbf{R}'$$

given by the same formula is surjective.

If $f: X \to Y$ is a map and S a subset of X, we denote by


$$f|S$$


the restriction of f to S, namely the map f viewed as a map defined only
on S. For instance, if $f: \mathbf{R} \to \mathbf{R}'$ is the map $x \mapsto x^2$, then f is not injective,
but $f|\mathbf{R}'$ is injective. We often let $f_S = f\chi_S$ be the function equal to
f on S and 0 outside S.

A composite of injective maps is injective, and a composite of surjec- 
tive maps is surjective. Hence a composite of bijective maps is bijective. 
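
For maps between finite sets, the definitions of injective and surjective can be checked mechanically. The following is only a sketch: the helper names `is_injective` and `is_surjective` are hypothetical, and the finite ranges standing in for the real numbers and the non-negative reals are an assumption made so that the check terminates.

```python
def is_injective(f, domain):
    """True if distinct elements of the domain have distinct images."""
    images = [f(x) for x in domain]
    return len(set(images)) == len(images)


def is_surjective(f, domain, codomain):
    """True if the image f(domain) is all of the codomain."""
    return {f(x) for x in domain} == set(codomain)


def square(x):
    return x * x


# Finite stand-ins (an assumption of this sketch) for R and for R'.
R = range(-3, 4)                  # -3, -2, ..., 3
R_nonneg_values = {0, 1, 4, 9}    # the squares of the elements of R

print(is_injective(square, R))                     # False: (-1)^2 = 1^2
print(is_injective(square, range(0, 4)))           # True: restriction to x >= 0
print(is_surjective(square, R, R_nonneg_values))   # True
print(is_surjective(square, R, R))                 # False: -1 is not a square
```

The last two lines show why the set of arrival matters: the same formula gives a surjective map or not, depending on which set is declared as the target.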

We denote by $\mathbf{Q}$, $\mathbf{Z}$ the sets of rational numbers and integers respectively.
We denote by $\mathbf{Z}^+$ the set of positive integers (integers > 0), and
similarly by $\mathbf{R}^+$ the set of positive reals. We denote by $\mathbf{N}$ the set of
natural numbers (integers $\geq 0$), and by $\mathbf{C}$ the complex numbers. A mapping
into $\mathbf{R}$ or $\mathbf{C}$ will be called a function.

Let S and I be sets. By a family of elements of S, indexed by I, one
means simply a map $f: I \to S$. However, when we speak of a family, we
write f(i) as $f_i$, and also use the notation $\{f_i\}_{i \in I}$ to denote the family.


Example 1. Let S be the set consisting of the single element 3. Let
I = {1,...,n} be the set of integers from 1 to n. A family of elements of
S, indexed by I, can then be written $\{a_i\}_{i=1,\ldots,n}$ with each $a_i = 3$. Note
that a family is different from a subset. The same element of S may
receive distinct indices.

A family of elements of a set S indexed by positive integers, or non- 
negative integers, is also called a sequence. 


Example 2. A sequence of real numbers is written frequently in the
form


$$\{x_1, x_2, \ldots\} \quad \text{or} \quad \{x_n\}_{n \geq 1}$$


and stands for the map $f: \mathbf{Z}^+ \to \mathbf{R}$ such that $f(i) = x_i$. As before, note
that a sequence can have all its elements equal to each other, that is


$$\{1, 1, 1, \ldots\}$$


is a sequence of integers, with $x_i = 1$ for each $i \in \mathbf{Z}^+$.




We define a family of sets indexed by a set I in the same manner, that
is, a family of sets indexed by I is an assignment


$$i \mapsto S_i$$


which to each $i \in I$ associates a set $S_i$. The sets $S_i$ may or may not have
elements in common, and it is conceivable that they may all be equal.
As before, we write the family $\{S_i\}_{i \in I}$.

We can define the intersection and union of families of sets, just as for
the intersection and union of a finite number of sets. Thus, if $\{S_i\}_{i \in I}$ is a
family of sets, we define the intersection of this family to be the set


$$\bigcap_{i \in I} S_i$$


consisting of all elements x which lie in all $S_i$. We define the union


$$\bigcup_{i \in I} S_i$$


to be the set consisting of all x such that x lies in some $S_i$.

If S, S' are sets, we define $S \times S'$ to be the set of all pairs (x, y) with
$x \in S$ and $y \in S'$. We can define finite products in a similar way. If $S_1$,
$S_2, \ldots$ is a sequence of sets, we define the product


$$\prod_{i=1}^{\infty} S_i$$


to be the set of all sequences $(x_1, x_2, \ldots)$ with $x_i \in S_i$. Similarly, if I is an
indexing set, and $\{S_i\}_{i \in I}$ a family of sets, we define the product


$$\prod_{i \in I} S_i$$


to be the set of all families $\{x_i\}_{i \in I}$ with $x_i \in S_i$.

Let X, Y, Z be sets. We have the formula


$$(X \cup Y) \times Z = (X \times Z) \cup (Y \times Z).$$


To prove this, let $(w, z) \in (X \cup Y) \times Z$ with $w \in X \cup Y$ and $z \in Z$. Then
$w \in X$ or $w \in Y$. Say $w \in X$. Then $(w, z) \in X \times Z$. Thus


$$(X \cup Y) \times Z \subset (X \times Z) \cup (Y \times Z).$$


Conversely, $X \times Z$ is contained in $(X \cup Y) \times Z$ and so is $Y \times Z$. Hence
their union is contained in $(X \cup Y) \times Z$, thereby proving our assertion.
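
The formula just proved can be spot-checked on small finite sets. This is a sketch only: the function name `product_distributes` and the sample sets are illustrative assumptions, and a finite check does not replace the proof.

```python
from itertools import product


def product_distributes(X, Y, Z):
    """Check (X u Y) x Z == (X x Z) u (Y x Z) for finite sets X, Y, Z."""
    left = set(product(X | Y, Z))
    right = set(product(X, Z)) | set(product(Y, Z))
    return left == right


# Sample sets chosen only for illustration.
X, Y, Z = {1, 2}, {2, 3}, {"a", "b"}
print(product_distributes(X, Y, Z))       # True
print(product_distributes(set(), Y, Z))   # True, also when X is empty
```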

We say that two sets X, Y are disjoint if their intersection is empty.
We say that a union $X \cup Y$ is disjoint if X and Y are disjoint. Note that
if X, Y are disjoint, then $(X \times Z)$ and $(Y \times Z)$ are disjoint.

We can take products with arbitrary families. For instance, if $\{X_i\}_{i \in I}$
is a family of sets, then


$$\left(\bigcup_{i \in I} X_i\right) \times Z = \bigcup_{i \in I} (X_i \times Z).$$


If the family $\{X_i\}_{i \in I}$ is disjoint (that is $X_i \cap X_j$ is empty if $i \neq j$ for i,
$j \in I$), then the sets $X_i \times Z$ are also disjoint.

We have similar formulas for intersections. For instance,


$$(X \cap Y) \times Z = (X \times Z) \cap (Y \times Z).$$


We leave the proof to the reader.

Let X be a set and Y a subset. The complement of Y in X, denoted
by $\mathcal{C}_X Y$, or $X - Y$, is the set of all elements $x \in X$ such that $x \notin Y$. If Y,
Z are subsets of X, then we have the following formulas:


$$\mathcal{C}_X(Y \cup Z) = \mathcal{C}_X Y \cap \mathcal{C}_X Z,$$
$$\mathcal{C}_X(Y \cap Z) = \mathcal{C}_X Y \cup \mathcal{C}_X Z.$$


These are essentially reformulations of definitions. For instance, suppose
$x \in X$ and $x \notin (Y \cup Z)$. Then $x \notin Y$ and $x \notin Z$. Hence $x \in \mathcal{C}_X Y \cap \mathcal{C}_X Z$.
Conversely, if $x \in \mathcal{C}_X Y \cap \mathcal{C}_X Z$, then x lies neither in Y nor in Z, and
hence $x \in \mathcal{C}_X(Y \cup Z)$. This proves the first formula. We leave the second
to the reader. Exercise: Formulate these formulas for the complement of
the union of a family of sets, and the complement of the intersection of a
family of sets.

Let A, B be sets and $f: A \to B$ a mapping. If Y is a subset of B, we
define $f^{-1}(Y)$ to be the set of all $x \in A$ such that $f(x) \in Y$. It may be that
$f^{-1}(Y)$ is empty, of course. We call $f^{-1}(Y)$ the inverse image of Y (under
f). If f is injective, and Y consists of one element y, then $f^{-1}(\{y\})$ is
either empty or has precisely one element.

The following statements are easily proved: 


If $f: A \to B$ is a map, and Y, Z are subsets of B, then


$$f^{-1}(Y \cup Z) = f^{-1}(Y) \cup f^{-1}(Z),$$
$$f^{-1}(Y \cap Z) = f^{-1}(Y) \cap f^{-1}(Z).$$


More generally, if $\{Y_i\}_{i \in I}$ is a family of subsets of B, then


$$f^{-1}\left(\bigcup_{i \in I} Y_i\right) = \bigcup_{i \in I} f^{-1}(Y_i),$$


and similarly for the intersection. Furthermore, if we denote by $Y - Z$
the set of all elements $y \in Y$ and $y \notin Z$, then


$$f^{-1}(Y - Z) = f^{-1}(Y) - f^{-1}(Z).$$


In particular,


$$f^{-1}(\mathcal{C}_B Z) = \mathcal{C}_A f^{-1}(Z).$$


Thus the operation $f^{-1}$ commutes with all set theoretic operations.
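
The identities for inverse images can be tested on a finite example. The sketch below uses hypothetical helpers `preimage` and `image` and a squaring map on a small set of integers, all chosen for illustration. As an aside not taken from the text, the final lines show that the direct image need not commute with intersections, which is what makes the behavior of the inverse image worth noting.

```python
def preimage(f, A, Y):
    """Inverse image of Y under f, where f is defined on the finite set A."""
    return {x for x in A if f(x) in Y}


def image(f, S):
    """Direct image f(S) of a finite set S."""
    return {f(x) for x in S}


def f(x):
    return x * x


# Illustrative finite sets: f maps A into B.
A = set(range(-3, 4))
B = set(range(0, 10))
Y, Z = {0, 1, 4}, {4, 9}

print(preimage(f, A, Y | Z) == preimage(f, A, Y) | preimage(f, A, Z))  # True
print(preimage(f, A, Y & Z) == preimage(f, A, Y) & preimage(f, A, Z))  # True
print(preimage(f, A, Y - Z) == preimage(f, A, Y) - preimage(f, A, Z))  # True
print(preimage(f, A, B - Z) == A - preimage(f, A, Z))                  # True

# Aside: the direct image can fail to commute with intersection.
S1, S2 = {-2}, {2}
print(image(f, S1 & S2))            # set()
print(image(f, S1) & image(f, S2))  # {4}
```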


I, §2. DENUMERABLE SETS 


Let n be a positive integer. Let $J_n$ be the set consisting of all integers k,
$1 \leq k \leq n$. If S is a set, we say that S has n elements if there is a
bijection between S and $J_n$. Such a bijection associates with each integer
k as above an element of S, say $k \mapsto a_k$. Thus we may use $J_n$ to “count”
S. Part of what we assume about the basic facts concerning positive
integers is that if S has n elements, then the integer n is uniquely determined
by S.

One also agrees to say that a set has 0 elements if the set is empty. 

We shall say that a set S is denumerable if there exists a bijection of
S with the set of positive integers $\mathbf{Z}^+$. Such a bijection is then said to
enumerate the set S. It is a mapping


$$n \mapsto a_n$$


which to each positive integer n associates an element of S, the mapping
being injective and surjective.

If D is a denumerable set, and $f: S \to D$ is a bijection of some set S
with D, then S is also denumerable. Indeed, there is a bijection $g: D \to \mathbf{Z}^+$,
and hence $g \circ f$ is a bijection of S with $\mathbf{Z}^+$.

Let T be a set. A sequence of elements of T is simply a mapping of
$\mathbf{Z}^+$ into T. If the map is given by the association $n \mapsto x_n$, we also write
the sequence as $\{x_n\}_{n \geq 1}$, or also $\{x_1, x_2, \ldots\}$. For simplicity, we also
write $\{x_n\}$ for the sequence. Thus we think of the sequence as prescribing
a first, second, ..., n-th element of T. We use the same braces for
sequences as for sets, but the context will always make our meaning
clear.


Examples. The even positive integers may be viewed as a sequence
$\{x_n\}$ if we put $x_n = 2n$ for n = 1, 2, .... The odd positive integers may
also be viewed as a sequence $\{y_n\}$ if we put $y_n = 2n - 1$ for n = 1, 2, ....
In each case, the sequence gives an enumeration of the given set.


We also use the word sequence for mappings of the natural numbers
into a set, thus allowing our sequences to start from 0 instead of 1. If we
need to specify whether a sequence starts with the 0-th term or the first
term, we write


$$\{x_n\}_{n \geq 0} \quad \text{or} \quad \{x_n\}_{n \geq 1}$$


according to the desired case. Unless otherwise specified, however, we
always assume that a sequence will start with the first term. Note
that from a sequence $\{x_n\}_{n \geq 0}$ we can define a new sequence by letting
$y_n = x_{n-1}$ for $n \geq 1$. Then $y_1 = x_0$, $y_2 = x_1$, .... Thus there is no essential
difference between the two kinds of sequences.

Given a sequence $\{x_n\}$, we call $x_n$ the n-th term of the sequence. A
sequence may very well be such that all its terms are equal. For instance,
if we let $x_n = 1$ for all $n \geq 1$, we obtain the sequence {1, 1, 1, ...}.
Thus there is a difference between a sequence of elements in a set T, and
a subset of T. In the example just given, the set of all terms of the
sequence consists of one element, namely the single number 1.

Let $\{x_1, x_2, \ldots\}$ be a sequence in a set S. By a subsequence we shall
mean a sequence $\{x_{n_1}, x_{n_2}, \ldots\}$ such that $n_1 < n_2 < \cdots$. For instance, if
$\{x_n\}$ is the sequence of positive integers, $x_n = n$, the sequence of even
positive integers $\{x_{2n}\}$ is a subsequence.

An enumeration of a set S is of course a sequence in S. 

A set is finite if the set is empty, or if the set has n elements for some 
positive integer n. If a set is not finite, it is called infinite. 

Occasionally, a map of $J_n$ into a set T will be called a finite sequence
in T. A finite sequence is written as usual,


$$\{x_1, \ldots, x_n\} \quad \text{or} \quad \{x_i\}_{i=1,\ldots,n}.$$


When we need to specify the distinction between finite sequences and
maps of $\mathbf{Z}^+$ into T, we call the latter infinite sequences. Unless otherwise
specified, we shall use the word “sequence” to mean infinite sequence.


Proposition 2.1. Let D be an infinite subset of $\mathbf{Z}^+$. Then D is denumerable,
and in fact there is a unique enumeration of D, namely
$\{k_1, k_2, \ldots\}$ such that


$$k_1 < k_2 < \cdots < k_n < k_{n+1} < \cdots.$$


Proof. We let $k_1$ be the smallest element of D. Suppose inductively
that we have defined $k_1 < \cdots < k_n$ in such a way that any element k in D
which is not equal to $k_1, \ldots, k_n$ is $> k_n$. We define $k_{n+1}$ to be the
smallest element of D which is $> k_n$. Then the map $n \mapsto k_n$ is the desired
enumeration of D.
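
The inductive construction in this proof translates directly into a procedure that lists D in increasing order. The sketch below assumes D is given by a membership test `in_D` (a hypothetical name), and uses the primes as the sample infinite subset. If D were finite, the generator would run forever once D is exhausted, which matches the hypothesis that D is infinite.

```python
from itertools import count, islice


def enumerate_subset(in_D):
    """Yield k_1 < k_2 < ... for D = {k in Z^+ : in_D(k)}, assumed infinite.

    Scanning 1, 2, 3, ... in order means that each element yielded is the
    smallest element of D larger than the previous one, as in the proof.
    """
    for k in count(1):
        if in_D(k):
            yield k


def is_prime(k):
    """Trial-division primality test, used only as a sample membership test."""
    return k >= 2 and all(k % d for d in range(2, int(k ** 0.5) + 1))


print(list(islice(enumerate_subset(is_prime), 8)))
# [2, 3, 5, 7, 11, 13, 17, 19]
```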


Corollary 2.2. Let S be a denumerable set and D an infinite subset of S. 
Then D is denumerable. 


Proof. Given an enumeration of S, the subset D corresponds to a
subset of $\mathbf{Z}^+$ in this enumeration. Using Proposition 2.1 we conclude
that we can enumerate D.


Proposition 2.3. Every infinite set contains a denumerable subset. 


Proof. Let S be an infinite set. For every non-empty subset T of S, we
select a definite element $a_T$ in T. We then proceed by induction. We let
$x_1$ be the chosen element $a_S$. Suppose that we have chosen $x_1, \ldots, x_n$
having the property that for each k = 2, ..., n the element $x_k$ is the
selected element in the subset which is the complement of $\{x_1, \ldots, x_{k-1}\}$.
We let $x_{n+1}$ be the selected element in the complement of the set
$\{x_1, \ldots, x_n\}$. By induction, we thus obtain an association $n \mapsto x_n$ for all
positive integers n, and since $x_n \neq x_k$ for all k < n it follows that our
association is injective, i.e. gives an enumeration of a subset of S.


Proposition 2.4. Let D be a denumerable set, and $f: D \to S$ a surjective
mapping. Then S is denumerable or finite.


Proof. For each $y \in S$, there exists an element $x_y \in D$ such that $f(x_y) = y$
because f is surjective. The association $y \mapsto x_y$ is an injective mapping
of S into D, because if $y, z \in S$ and $x_y = x_z$, then


$$y = f(x_y) = f(x_z) = z.$$


Let $g(y) = x_y$. The image of g is a subset of D, and D is denumerable.
Since g is a bijection between S and its image, it follows that S is
denumerable or finite.


Proposition 2.5. Let D be a denumerable set. Then $D \times D$ (the set of
all pairs (x, y) with $x, y \in D$) is denumerable.


Proof. There is a bijection between $D \times D$ and $\mathbf{Z}^+ \times \mathbf{Z}^+$, so it will
suffice to prove that $\mathbf{Z}^+ \times \mathbf{Z}^+$ is denumerable. Consider the mapping of
$\mathbf{Z}^+ \times \mathbf{Z}^+ \to \mathbf{Z}^+$ given by

$$(m, n) \mapsto 2^n 3^m.$$

In view of Proposition 2.1, it will suffice to prove that this mapping is
injective. Suppose $2^n 3^m = 2^r 3^s$ for positive integers n, m, r, s. Say r < n.
Dividing both sides by $2^r$, we obtain

$$2^k 3^m = 3^s$$

with $k = n - r \geq 1$. Then the left-hand side is even, but the right-hand
side is odd, so the assumption r < n is impossible. Similarly, we cannot
have n < r. Hence r = n. Then we obtain $3^m = 3^s$. If m > s, then $3^{m-s} = 1$
which is impossible. Similarly, we cannot have s > m, whence m = s.
Hence our map is injective, as was to be proved.
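
The injectivity argument amounts to saying that the exponents of 2 and 3 can be read back from the value of the map. A minimal sketch follows, with hypothetical names `encode` and `decode`; the finite grid of pairs is an assumption made only so that the check terminates.

```python
def encode(m, n):
    """The map (m, n) -> 2^n * 3^m from the proof."""
    return 2 ** n * 3 ** m


def decode(k):
    """Recover (m, n) from k = 2^n * 3^m by dividing out the factors 2 and 3."""
    if k < 1:
        raise ValueError("k must be a positive integer")
    n = 0
    while k % 2 == 0:
        k //= 2
        n += 1
    m = 0
    while k % 3 == 0:
        k //= 3
        m += 1
    if k != 1:
        raise ValueError("not of the form 2^n * 3^m")
    return m, n


# Check injectivity on a finite grid of pairs of positive integers.
pairs = [(m, n) for m in range(1, 30) for n in range(1, 30)]
codes = [encode(m, n) for m, n in pairs]
print(len(set(codes)) == len(pairs))                         # True
print(all(decode(encode(m, n)) == (m, n) for m, n in pairs)) # True
print(encode(1, 2), decode(12))                              # 12 (1, 2)
```

The map is far from surjective (5 is not a value, for instance), which is why the proof appeals to Proposition 2.1 for the image rather than claiming a bijection with the positive integers directly.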


Proposition 2.6. Let $\{D_1, D_2, \ldots\}$ be a sequence of denumerable sets.
Let S be the union of all sets $D_i$ (i = 1, 2, ...). Then S is denumerable.


Proof. For each i = 1, 2, ... we enumerate the elements of $D_i$, as
indicated in the following notation:


$$D_1: \{x_{11}, x_{12}, x_{13}, \ldots\}$$


$$D_2: \{x_{21}, x_{22}, x_{23}, \ldots\}$$


$$D_i: \{x_{i1}, x_{i2}, x_{i3}, \ldots\}$$


The map $f: \mathbf{Z}^+ \times \mathbf{Z}^+ \to S$ given by

$$f(i, j) = x_{ij}$$


is then a surjective map of $\mathbf{Z}^+ \times \mathbf{Z}^+$ onto S. By Proposition 2.4, it
follows that S is denumerable.


Corollary 2.7. Let F be a non-empty finite set and D a denumerable set.
Then $F \times D$ is denumerable. If $S_1, S_2, \ldots$ is a sequence of sets,
each of which is finite or denumerable, then the union $S_1 \cup S_2 \cup \cdots$ is
denumerable or finite.


Proof. There is an injection of F into $\mathbf{Z}^+$ and a bijection of D with
$\mathbf{Z}^+$. Hence there is an injection of $F \times D$ into $\mathbf{Z}^+ \times \mathbf{Z}^+$ and we can
apply Corollary 2.2 and Proposition 2.6 to prove the first statement.
One could also define a surjective map of $\mathbf{Z}^+ \times \mathbf{Z}^+$ onto $F \times D$. As for
the second statement, each finite set is contained in some denumerable
set, so that the second statement follows from Propositions 2.1 and 2.6.


For convenience, we shall say that a set is countable if it is either finite 
or denumerable. 


I, §3. ZORN’S LEMMA


In order to deal efficiently with infinitely many sets simultaneously, one
needs a special property. To state it, we need some more terminology.
Let S be a set. An ordering (also called partial ordering) of (or on) S
is a relation, written $x \leq y$, among some pairs of elements of S, having
the following properties.


ORD 1. We have $x \leq x$.
ORD 2. If $x \leq y$ and $y \leq z$ then $x \leq z$.
ORD 3. If $x \leq y$ and $y \leq x$ then $x = y$.


We sometimes write $y \geq x$ for $x \leq y$. Note that we don’t require that the
relation $x \leq y$ or $y \leq x$ hold for every pair of elements (x, y) of S. Some
pairs may not be comparable. If the ordering satisfies this additional
property, then we say that it is a total ordering.


Example 1. Let G be a group. Let S be the set of subgroups. If H,
H' are subgroups of G, we define


$$H \leq H'$$


if H is a subgroup of H'. One verifies immediately that this relation
defines an ordering on S. Given two subgroups, H, H' of G, we do not
necessarily have $H \leq H'$ or $H' \leq H$.


Example 2. Let R be a ring, and let S be the set of left ideals of R.
We define an ordering in S in a way similar to the above, namely if L, L'
are left ideals of R, we define

$$L \leq L'$$

if $L \subset L'$.


Example 3. Let X be a set, and S the set of subsets of X. If Y, Z are
subsets of X, we define $Y \leq Z$ if Y is a subset of Z. This defines an
ordering on S.


In all these examples, the relation of ordering is said to be that of 
inclusion. 

In an ordered set, if $x \leq y$ and $x \neq y$ we then write $x < y$.

Let A be an ordered set, and B a subset. Then we can define an
ordering on B by defining $x \leq y$ for $x, y \in B$ to hold if and only if $x \leq y$
in A. We shall say that it is the ordering on B induced by the ordering
on A, or is the restriction to B of the partial ordering of A.

Let S be an ordered set. By a least element of S (or a smallest
element) one means an element $a \in S$ such that $a \leq x$ for all $x \in S$. Similarly,
by a greatest element one means an element b such that $x \leq b$ for
all $x \in S$.

By a maximal element m of S one means an element such that if $x \in S$
and $x \geq m$, then $x = m$. Note that a maximal element need not be a
greatest element. There may be many maximal elements in S, whereas if
a greatest element exists, then it is unique (proof?).
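
The distinction between maximal and greatest elements is easy to see in a small example. The sketch below uses divisibility on a finite set of positive integers as the ordering; this choice of ordering, the sample sets, and the function names are assumptions made for illustration and are not taken from the text.

```python
def leq(x, y):
    """x <= y in the divisibility ordering: x divides y."""
    return y % x == 0


def maximal_elements(S):
    """Elements m of S such that x in S and x >= m imply x == m."""
    return {m for m in S if all(x == m for x in S if leq(m, x))}


def greatest_element(S):
    """The element b of S with x <= b for all x in S, or None if none exists."""
    for b in S:
        if all(leq(x, b) for x in S):
            return b
    return None


S = {2, 3, 4, 6, 8, 12}
print(sorted(maximal_elements(S)))      # [8, 12]: two maximal elements
print(greatest_element(S))              # None: 8 and 12 are not comparable
print(greatest_element({1, 2, 3, 6}))   # 6
```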

Let S be an ordered set. We shall say that S is totally ordered if given
$x, y \in S$ we have necessarily $x \leq y$ or $y \leq x$.


Example 4. The integers Z are totally ordered by the usual ordering. 
So are the real numbers. 


Let S be an ordered set, and T a subset. An upper bound of T (in S)
is an element $b \in S$ such that $x \leq b$ for all $x \in T$. A least upper bound of
T in S is an upper bound b such that if c is another upper bound, then
$b \leq c$. We shall say that S is inductively ordered if every non-empty
totally ordered subset has an upper bound.

We shall say that S is strictly inductively ordered if every non-empty 
totally ordered subset has a least upper bound. 

In Examples 1, 2, 3, in each case, the set is strictly inductively ordered.
To prove this, let us take Example 1. Let T be a non-empty totally
ordered subset of the set of subgroups of G. This means that if $H, H' \in T$,
then $H \subset H'$ or $H' \subset H$. Let U be the union of all sets in T. Then:


(1) U is a subgroup. Proof: If $x, y \in U$, there exist subgroups H,
$H' \in T$ such that $x \in H$ and $y \in H'$. If, say, $H \subset H'$, then both
$x, y \in H'$ and hence $xy \in H'$. Hence $xy \in U$. Also, $x^{-1} \in H'$, so
$x^{-1} \in U$. Hence U is a subgroup.

(2) U is an upper bound for each element of T. Proof: Every $H \in T$
is contained in U, so $H \leq U$ for all $H \in T$.

(3) U is a least upper bound for T. Proof: Any subgroup of G which
contains all the subgroups $H \in T$ must then contain their union
U.


The proof that the sets in Examples 2, 3 are strictly inductively 
ordered is entirely similar. 

We can now state the property mentioned at the beginning of the 
section. 


Zorn’s Lemma. Let S be a non-empty inductively ordered set. Then 
there exists a maximal element in S. 


Zorn’s lemma could be just taken as an axiom of set theory. How- 
ever, it is not psychologically completely satisfactory as an axiom, be- 
cause its statement is too involved, and one does not visualize easily the 
existence of the maximal element asserted in that statement. We show 
how one can prove Zorn’s lemma from other properties of sets which 
everyone would immediately grant as acceptable psychologically. 

From now on to the end of the proof of Theorem 3.1, we let A be a 
non-empty partially ordered and strictly inductively ordered set. We recall
that strictly inductively ordered means that every non-empty totally
ordered subset has a least upper bound. We assume given a map
$f: A \to A$ such that for all $x \in A$ we have $x \leq f(x)$. We could call such
a map an increasing map.

Let $a \in A$. Let B be a subset of A. We shall say that B is admissible
if:

(1) B contains a.

(2) We have $f(B) \subset B$.

(3) Whenever T is a totally ordered subset of B, the least upper
bound of T in A lies in B.


Then B is also strictly inductively ordered, by the induced ordering of A.
We shall prove:
We shall prove: 


Theorem 3.1 (Bourbaki). Let A be a non-empty partially ordered and
strictly inductively ordered set. Let $f: A \to A$ be an increasing mapping.
Then there exists an element $x_0 \in A$ such that $f(x_0) = x_0$.


Proof. Suppose that A were totally ordered. By assumption, it would
have a least upper bound $b \in A$, and then


$$b \leq f(b) \leq b,$$


so that in this case, our theorem is clear. The whole problem is to 
reduce the theorem to that case. In other words, what we need to find is 
a totally ordered admissible subset of A. 

If we throw out of A all elements $x \in A$ such that x is not $\geq a$, then
what remains is obviously an admissible subset. Thus without loss of
generality, we may assume that A has a least element a, that is $a \leq x$ for
all $x \in A$.

Let M be the intersection of all admissible subsets of A. Note that 
A itself is an admissible subset, and that all admissible subsets of A 
contain a, so that M is not empty. Furthermore, M is itself an admissible
subset of A. To see this, let $x \in M$. Then x is in every admissible
subset, so f(x) is also in every admissible subset, and hence $f(x) \in M$.
Hence $f(M) \subset M$. If T is a totally ordered non-empty subset of M, and
b is the least upper bound of T in A, then b lies in every admissible 
subset of A, and hence lies in M. It follows that M is the smallest 
admissible subset of A, and that any admissible subset of A contained in 
M is equal to M. 

We shall prove that M is totally ordered, and thereby prove Theorem 
3.1. 

[First we make some remarks which don’t belong to the proof, but
will help in the understanding of the subsequent lemmas. Since $a \in M$, we
see that $f(a) \in M$, $f \circ f(a) \in M$, and in general $f^n(a) \in M$. Furthermore,


$$a \leq f(a) \leq f^2(a) \leq \cdots.$$


If we had an equality somewhere, we would be finished, so we may
assume that the inequalities hold. Let $D_0$ be the totally ordered set
$\{f^n(a)\}_{n \geq 0}$. Then $D_0$ looks like this:


$$a < f(a) < f^2(a) < \cdots < f^n(a) < \cdots.$$


Let $a_1$ be the least upper bound of $D_0$. Then we can form


$$a_1 < f(a_1) < f^2(a_1) < \cdots$$


in the same way to obtain $D_1$, and we can continue this process, to
obtain

$$D_1, D_2, \ldots.$$


It is clear that $D_1, D_2, \ldots$ are contained in M. If we had a precise way
of expressing the fact that we can establish a never-ending string of such
denumerable sets, then we would obtain what we want. The point is that
we are now trying to prove Zorn’s lemma, which is the natural tool for
guaranteeing the existence of such a string. However, given such a string,
we observe that its elements have two properties: If c is an element of
such a string and $x < c$, then $f(x) \leq c$. Furthermore, there is no element
between c and f(c), that is if x is an element of the string, then $x \leq c$ or
$f(c) \leq x$. We shall now prove two lemmas which show that elements of
M have these properties.]

Let $c \in M$. We shall say that c is an extreme point of M if whenever
$x \in M$ and $x < c$, then $f(x) \leq c$. For each extreme point $c \in M$ we let


$$M_c = \text{set of } x \in M \text{ such that } x \leq c \text{ or } f(c) \leq x.$$


Note that $M_c$ is not empty because a is in it.


Lemma 3.2. We have $M_c = M$ for every extreme point c of M.


Proof. It will suffice to prove that $M_c$ is an admissible subset. Let
$x \in M_c$. If $x < c$ then $f(x) \leq c$ so $f(x) \in M_c$. If $x = c$ then $f(x) = f(c)$ is
again in $M_c$. If $f(c) \leq x$, then $f(c) \leq x \leq f(x)$, so once more $f(x) \in M_c$.
Thus we have proved that $f(M_c) \subset M_c$.

Let T be a totally ordered subset of $M_c$ and let b be the least upper
bound of T in A. Since M is admissible, we have $b \in M$. If all elements
$x \in T$ are $\leq c$, then $b \leq c$ and $b \in M_c$. If some $x \in T$ is such that
$f(c) \leq x$, then


$$f(c) \leq x \leq b,$$


and so b is in $M_c$. This proves our lemma.


Lemma 3.3. Every element of $M$ is an extreme point.

Proof. Let $E$ be the set of extreme points of $M$. Then $E$ is not empty because $a \in E$. It will suffice to prove that $E$ is an admissible subset. We first prove that $f$ maps $E$ into itself. Let $c \in E$. Let $x \in M$ and suppose $x < f(c)$. We must prove that

$$f(x) \le f(c).$$

By Lemma 3.2, $M = M_c$, and hence we have $x < c$, or $x = c$, or $f(c) \le x$. This last possibility cannot occur because $x < f(c)$. If $x < c$ then

$$f(x) \le c \le f(c).$$

If $x = c$ then $f(x) = f(c)$, and hence $f(E) \subset E$.

Next let $T$ be a totally ordered subset of $E$. Let $b$ be the least upper bound of $T$ in $A$. We must prove that $b \in E$. Note that $b \in M$ because $M$ is admissible. Let $x \in M$ and $x < b$. We must show that $f(x) \le b$. If for all $c \in T$ we have $f(c) \le x$, then $c \le f(c) \le x$ for all $c \in T$, whence $x$ is an upper bound for $T$, whence $b \le x$, which is impossible. Otherwise, since $M_c = M$ for all $c \in E$, we must have $x \le c$ for some $c \in T$. If $x < c$, then $f(x) \le c \le b$, and if $x = c$, then $c < b$, and since $b \in M = M_c$ we must have $f(c) \le b$, so that

$$f(x) = f(c) \le b.$$

This proves that $b \in E$, that $E$ is admissible, and thus proves Lemma 3.3.


We now see trivially that $M$ is totally ordered. For let $x, y \in M$. Then $x$ is an extreme point of $M$ by Lemma 3.3, and $y \in M_x$, so $y \le x$ or

$$x \le f(x) \le y,$$

thereby proving that $M$ is totally ordered. As remarked previously, this concludes the proof of Theorem 3.1.

We shall obtain Zorn’s lemma essentially as a corollary of Theorem 
3.1. We first obtain Zorn’s lemma in a slightly weaker form. 


Corollary 3.4. Let A be a non-empty strictly inductively ordered set. 
Then A has a maximal element. 


Proof. Suppose that $A$ does not have a maximal element. Then for each $x \in A$ there exists an element $y_x \in A$ such that $x < y_x$. Let $f\colon A \to A$ be the map such that $f(x) = y_x$ for all $x \in A$. Then $A$, $f$ satisfy the hypotheses of Theorem 3.1 and applying Theorem 3.1 yields a contradiction.


The only difference between Corollary 3.4 and Zorn’s lemma is that in Corollary 3.4, we assume that a non-empty totally ordered subset has a least upper bound, rather than an upper bound. It is, however, a simple matter to reduce Zorn’s lemma to the seemingly weaker form of Corollary 3.4. We do this in the second corollary.


Corollary 3.5 (Zorn’s Lemma). Let S be a non-empty inductively 
ordered set. Then S has a maximal element. 


Proof. Let $A$ be the set of non-empty totally ordered subsets of $S$. Then $A$ is not empty since any subset of $S$ with one element belongs to $A$. If $X, Y \in A$, we define $X \le Y$ to mean $X \subset Y$. Then $A$ is partially ordered, and is in fact strictly inductively ordered. For let $T = \{X_i\}_{i \in I}$ be a totally ordered subset of $A$. Let

$$Z = \bigcup_{i \in I} X_i.$$

Then $Z$ is totally ordered. To see this, let $x, y \in Z$. Then $x \in X_i$ and $y \in X_j$ for some $i, j \in I$. Since $T$ is totally ordered, say $X_i \subset X_j$. Then $x, y \in X_j$ and since $X_j$ is totally ordered, $x \le y$ or $y \le x$. Thus $Z$ is totally ordered, and is obviously a least upper bound for $T$ in $A$. By Corollary 3.4, we conclude that $A$ has a maximal element $X_0$. This means that $X_0$ is a maximal totally ordered subset of $S$ (non-empty). Let $m$ be an upper bound for $X_0$ in $S$. Then $m$ is the desired maximal element of $S$. For if $x \in S$ and $m \le x$, then $X_0 \cup \{x\}$ is totally ordered, whence equal to $X_0$ by the maximality of $X_0$. Thus $x \in X_0$ and $x \le m$. Hence $x = m$, as was to be shown.


CHAPTER II


Topological Spaces 


This chapter develops the standard properties of topological spaces. Most 
of these properties do not go beyond the level of a convenient language. 
In the text proper, we have given precisely those results which are used 
very frequently in all analysis. In the exercises, we give additional results, 
of which some just give routine practice and others give more special 
results. To incorporate all this material in the text proper would be 
extremely oppressive and would obscure the principal lines of thought 
inherent in the basic aspects of the subject. The reader can always be 
referred to Bourbaki [Bo] or Kelley [Ke] for encyclopaedic treatments. 


II, §1. OPEN AND CLOSED SETS


Let $X$ be a set. By a topology on $X$ we mean a collection $\mathcal{T}$ of subsets
called the open sets of the topology, satisfying the following conditions:
TOP 1. The empty set and X itself are open. 
TOP 2. A finite intersection of open sets is open. 
TOP 3. An arbitrary union of open sets is open. 

Example 1. Let $X$ be any set. If we define an open set to be the empty set or $X$ itself, we have a topology on $X$, which is definitely not interesting.


Example 2. Let $X$ be a set, and define every subset to be open. In particular, each element of $X$ constitutes an open set. Again we have a topology, which is called the discrete topology on $X$. A space with the discrete topology is called a discrete space. It does not look as if this topology were any more interesting than that of Example 1, but in fact it does occur in practice.
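
On a finite set the axioms TOP 1, TOP 2, TOP 3 can be checked mechanically, which gives a quick way to experiment with Examples 1 and 2. The following is only a minimal sketch; the function name `is_topology` and the sample collections are our own illustrations, not notation from the text.

```python
from itertools import combinations

def is_topology(X, opens):
    """Check TOP 1-3 for a collection of subsets of a finite set X."""
    X = frozenset(X)
    opens = {frozenset(U) for U in opens}
    # Every member of the collection must be a subset of X.
    if not all(U <= X for U in opens):
        return False
    # TOP 1: the empty set and X itself are open.
    if frozenset() not in opens or X not in opens:
        return False
    # TOP 2 and TOP 3: for a finite collection it suffices to test pairs,
    # since larger intersections and unions follow by induction.
    for U, V in combinations(opens, 2):
        if U & V not in opens or U | V not in opens:
            return False
    return True

X = {1, 2, 3}
trivial = [set(), X]                                   # Example 1
discrete = [set(s) for r in range(len(X) + 1)
            for s in combinations(X, r)]               # Example 2: every subset
not_a_topology = [set(), {1}, {2}, X]                  # the union {1, 2} is missing

print(is_topology(X, trivial), is_topology(X, discrete),
      is_topology(X, not_a_topology))
```

The pairwise test is enough only because the collection is finite; for infinite collections TOP 3 is a genuinely stronger requirement.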


Example 3. Let X = R be the set of real numbers. Define a subset U 
of R to be open if for each point x in U there exists an open interval J 
containing x and contained in U. The three axioms of a topology are 
easily verified. This topology is called the ordinary topology. 


Example 4. Generalization of Example 3, and used very frequently in analysis. We recall that a normed vector space (over the real numbers) is a vector space $E$ together with a function on $E$ denoted by $x \mapsto |x|$ (real valued) such that:

NVS 1. We have $|x| \ge 0$ and $= 0$ if and only if $x = 0$.
NVS 2. If $c \in \mathbf{R}$ and $x \in E$, then $|cx| = |c||x|$.
NVS 3. If $x, y \in E$, then $|x + y| \le |x| + |y|$.

Similarly, one defines the notion of normed vector space over the complex numbers. The axioms are the same, except that we then take the number $c$ to be complex in NVS 2.

By an open ball $B$ in $E$ centered at a point $v$, and of radius $r > 0$, we mean the set of all $x \in E$ such that $|x - v| < r$. We denote such a ball by $B_r(v)$. We define a set $U$ to be open in $E$ if for each point $x \in U$ there exists an open ball $B$ centered at $x$ and contained in $U$. Again it is easy to verify that this defines a topology, also called the ordinary topology of the normed vector space. It is but an exercise to verify that an open ball is indeed an open set of this topology.

Let $\{x_n\}$ be a sequence in a normed vector space $E$. This sequence is said to be Cauchy if given $\epsilon$ (always assumed $> 0$) there exists $N$ such that for all $m, n \ge N$ we have

$$|x_m - x_n| < \epsilon.$$

This sequence is said to converge to an element $x$ if given $\epsilon$, there exists $N$ such that for all $n \ge N$ we have

$$|x - x_n| < \epsilon.$$


Examples of Normed Vector Spaces 


The sup norm. Let $S$ be a set. A map $f\colon S \to F$ of $S$ into a normed vector space $F$ is said to be bounded if there exists a number $C > 0$ such that $|f(x)| \le C$ for all $x \in S$. If $f$ is bounded, define

$$\|f\|_S = \|f\| = \sup_{x \in S} |f(x)|,$$

sup meaning least upper bound. It can be easily shown that the set of bounded maps $B(S, F)$ of $S$ into $F$ is a vector space, and that $\|\ \|$ is a norm on this space, called the sup norm.


The $L^1$-Norm. Let $E$ be the space of continuous functions on $[0, 1]$. For $f \in E$ define

$$\|f\|_1 = \int_0^1 |f(x)|\,dx.$$

Then $\|\ \|_1$ is a norm on $E$, called the $L^1$-norm. This norm will be a major object of study when we do integration later, in a general context.


Much of this book is devoted to studying the convergence of sequences for one or the other of the above two norms. For instance, consider the sup norm. A sequence of maps $\{f_n\}$ is said to be uniformly Cauchy on $S$ if given $\epsilon$ there exists $N$ such that for all $m, n \ge N$ we have

$$\|f_n - f_m\|_S < \epsilon.$$

It is said to be uniformly convergent to a map $f$ if given $\epsilon$ there exists $N$ such that for all $n \ge N$ we have

$$\|f_n - f\|_S < \epsilon.$$

In the second example, we would use the expressions $L^1$-Cauchy and $L^1$-convergent instead of uniformly Cauchy and uniformly convergent, if we replace the sup norm by the $L^1$-norm in these definitions.
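
The two notions of convergence really do differ. A standard illustration (our choice of example, not one taken from the text) is the sequence $f_n(x) = x^n$ on $[0, 1]$: its $L^1$-norm is $1/(n+1)$, which tends to $0$, while its sup norm is $1$ for every $n$. The sketch below approximates both norms on a grid; the helper names `sup_norm` and `l1_norm` and the grid size are assumptions made for the illustration.

```python
def sup_norm(f, grid):
    """Approximate sup |f(x)| by the maximum over the sample points."""
    return max(abs(f(x)) for x in grid)

def l1_norm(f, grid):
    """Approximate the integral of |f| by the trapezoidal rule on the grid."""
    total = 0.0
    for x0, x1 in zip(grid, grid[1:]):
        total += 0.5 * (abs(f(x0)) + abs(f(x1))) * (x1 - x0)
    return total

N = 10_000
grid = [k / N for k in range(N + 1)]   # sample points in [0, 1], endpoints included

for n in (1, 10, 100):
    f_n = lambda x, n=n: x ** n
    # Columns: n, approximate sup norm, approximate L^1-norm, exact value 1/(n+1)
    print(n, sup_norm(f_n, grid), l1_norm(f_n, grid), 1 / (n + 1))
```

So $\{f_n\}$ is $L^1$-convergent to the zero function, but it is not uniformly convergent to it.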

Up to a point, one can generalize the notion of subset of a normed vector space as follows. Let $X$ be a set. A distance function (also called a metric) on $X$ is a map $(x, y) \mapsto d(x, y)$ from $X \times X$ into $\mathbf{R}$ satisfying the following conditions:

DIS 1. We have $d(x, y) \ge 0$ for all $x, y \in X$, and $= 0$ if and only if $x = y$.

DIS 2. For all $x, y$, we have $d(x, y) = d(y, x)$.

DIS 3. For all $x, y, z$, we have

$$d(x, z) \le d(x, y) + d(y, z).$$

A set with a metric is called a metric space. We can then define open balls just as we did in the case of normed vector spaces, and also define a topology in a metric space just as we did for a normed vector space. Every open set is then a union of open balls. This topology is said to be determined by the metric.
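
The conditions DIS 1, DIS 2, DIS 3 can be tested by brute force on a finite set of points. The sketch below is only an illustration under that finiteness assumption; `is_metric`, `sup_dist`, and `squared_dist` are hypothetical names of our own. It shows that the distance coming from the sup norm passes, while the squared Euclidean distance fails the triangle inequality.

```python
from itertools import product

def is_metric(points, d, tol=1e-12):
    """Check DIS 1-3 for the function d on a finite list of points."""
    for x, y in product(points, repeat=2):
        if d(x, y) < 0:                        # DIS 1: non-negativity
            return False
        if (d(x, y) == 0) != (x == y):         # DIS 1: zero exactly when x = y
            return False
        if d(x, y) != d(y, x):                 # DIS 2: symmetry
            return False
    for x, y, z in product(points, repeat=3):
        if d(x, z) > d(x, y) + d(y, z) + tol:  # DIS 3: triangle inequality
            return False
    return True

def sup_dist(x, y):
    """Distance |x - y| for the sup norm on tuples of numbers."""
    return max(abs(a - b) for a, b in zip(x, y))

def squared_dist(x, y):
    """Squared Euclidean distance; not a metric, as the check shows."""
    return sum((a - b) ** 2 for a, b in zip(x, y))

points = [(0, 0), (1, 0), (2, 0), (0, 3)]
print(is_metric(points, sup_dist), is_metric(points, squared_dist))
```

The failure for `squared_dist` already occurs on the three collinear points $(0,0)$, $(1,0)$, $(2,0)$, where $4 > 1 + 1$.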

In a normed vector space, we can define the distance between elements 
x, y to be d(x, y) = |x — y|. It is immediately verified that this is a metric 
on the space. Conversely, the reader will see in Exercise 5 how a metric 
space can be embedded naturally in a normed vector space, in a manner 
preserving the metric, so that the “generality” of metric spaces is illusory. 
For convenience, we also make here the following definition: If A, B are 
subsets of a normed vector space, we define their distance to be 


$$d(A, B) = \inf |x - y|, \qquad x \in A,\ y \in B.$$


Basic theorems concerning subsets of normed vector spaces hold just as 
well for metric spaces. However, almost all metric spaces which arise 
naturally (and certainly all of those in this course) occur in a normed 
vector space with a natural linear structure. There is enough of a change 
of notation from |x — y| to d(x, y) to warrant carrying out proofs with 
the norm notation rather than the other. 

Let $\mathcal{T}$ and $\mathcal{T}'$ be topologies on a set $X$. One verifies at once that they are equal if and only if the following condition is satisfied: For each $x \in X$ and each set $U$ open in $\mathcal{T}$ containing $x$, there exists a set $U'$ open in $\mathcal{T}'$ such that $x \in U' \subset U$, and conversely, given $U'$ open in $\mathcal{T}'$ containing $x$, there exists $U$ open in $\mathcal{T}$ such that $x \in U \subset U'$.


Example. The reader will verify easily that two norms $|\ |_1$ and $|\ |_2$ on a vector space $E$ give rise to the same topology if and only if they satisfy the following condition: There exist $C_1, C_2 > 0$ such that for all $x \in E$ we have

$$C_1 |x|_1 \le |x|_2 \le C_2 |x|_1.$$


If this is the case, the norms are called equivalent. 
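
As a worked instance of this condition (a standard example, stated here under our own labels $|x|_{\sup}$ and $|x|_{\mathrm{euc}}$ for the sup norm and the Euclidean norm on $\mathbf{R}^n$), one can take $C_1 = 1$ and $C_2 = \sqrt{n}$:

```latex
\[
  |x|_{\sup}^2 = \max_i |x_i|^2
  \;\le\; \sum_{i=1}^n |x_i|^2 = |x|_{\mathrm{euc}}^2
  \;\le\; n \max_i |x_i|^2 = n\,|x|_{\sup}^2,
  \qquad\text{hence}\qquad
  |x|_{\sup} \le |x|_{\mathrm{euc}} \le \sqrt{n}\,|x|_{\sup}.
\]
```

In particular these two norms determine the same open sets in $\mathbf{R}^n$.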

Just to fix terminology, we define the closed ball centered at $v$ and of radius $r \ge 0$ to be the set of all $x \in E$ such that

$$|x - v| \le r.$$

We define the sphere centered at $v$, of radius $r$, to be the set of points $x$ such that

$$|x - v| = r.$$


Warning. In some books, what we call a ball is called a sphere. This is not good terminology, and the terminology used here is now essentially universally adopted.

Examples of normed vector spaces are given in the exercises. The standard properties of subsets of normed vector spaces having to do with limits are also valid in metric spaces (cf. Exercise 5). We can define balls and spheres in metric spaces just as in normed vector spaces. We can also define the notion of Cauchy sequence in a metric space $X$ as usual (again cf. Exercise 5), and $X$ is said to be complete if every Cauchy sequence converges, i.e. has a limit in $X$.


Example 5. Let $G$ be a group. We define a subset $U$ of $G$ to be open if for each element $x \in U$ there exists a subgroup $H$ of $G$, of finite index, such that $xH$ is contained in $U$. It is a simple exercise in algebra to show that this defines a topology, which is called the profinite topology.


Example 6. Let $R$ be a commutative ring (which according to standard conventions has a unit element). We define a subset $U$ of $R$ to be open if for each $x \in U$ there exists an ideal $J$ in $R$ such that $x + J$ is contained in $U$. It is a simple exercise in algebra to show that this defines a topology, which is called the ideal topology.


Note. The topologies of Examples 5 and 6 will not occur in any 
significant way in this course, and may thus be disregarded by anyone 
uninterested in this type of algebra. 


A set together with a topology is called a topological space. In this 
chapter we develop a large number of basic trivialities about topological 
spaces, and except for the numbered theorems, it is recommended that 
readers work out the proofs for all other assertions by themselves, even 
though we have given most of them. 

The duality between intersections and unions with respect to taking 
the complement of a subset allows us to define a topology by means of 
the complements of open sets, called closed sets. In any topological 
space, the closed sets satisfy the following conditions: 


CL 1. The empty set and the whole space are closed. 
CL 2. The finite union of closed sets is closed. 


CL 3. The arbitrary intersection of closed sets is closed. 


The first condition is clear, and the other two come from the fact that 
the complement of the union of subsets is equal to the intersection of 
their complements, and that the complement of the intersection of subsets 
is equal to the union of their complements. 

Conversely, given a collection $\mathcal{F}$ of subsets of a set $X$ (not yet a topological space), we say that it defines a topology on $X$ by means of closed sets if its elements satisfy the three conditions CL 1, 2, 3. We can then define an open set to be the complement of a set in $\mathcal{F}$.


Example 7. Let $X = \mathbf{R}^n$. Let $f(x_1, \ldots, x_n)$ be a polynomial in $n$ variables. A point $a = (a_1, \ldots, a_n)$ in $\mathbf{R}^n$ is called a zero of $f$ if $f(a) = 0$. We define a subset $S$ of $\mathbf{R}^n$ to be closed if there exists a family $\{f_i\}_{i \in I}$ of polynomials in $n$ variables (with real coefficients) such that $S$ consists precisely of the common zeros of all $f_i$ in the family (in other words, all points $a \in \mathbf{R}^n$ such that $f_i(a) = 0$ for all $i$). The reader may assume here the result that, for any such closed set $S$, there exists a finite number of polynomials $f_1, \ldots, f_r$ such that $S$ is already the set of zeros of the set $\{f_1, \ldots, f_r\}$. It is easy to prove that we have defined a topology by means of closed sets, and this topology is called the Zariski topology on $\mathbf{R}^n$. It is a topology which is adjusted to the study of algebraic sets, that is sets which are zeros of polynomials. It will not reappear in this course, and again an uninterested reader may omit it. It does become important in subsequent courses, however. In 2-space, a closed set consists of a finite number of points and algebraic curves. In 3-space, a closed set consists of a finite number of points, algebraic curves, and algebraic surfaces.


Let $X$ be a topological space, and $S$ a subset. A point $x \in X$ is said to
be adherent to S if given an open set U containing x, there is some point 
of S lying in U. In particular, every element of S is adherent to S. A 
point of X is called a boundary point of S if every open set containing 
this point also contains a point of S and a point not in S. Thus an 
adherent point of S which does not lie in S is a boundary point of S. An 
interior point of S is a point of S which does not lie in the boundary of 
S. The set Int(S) of interior points of S is open. 


A subset S of X is closed if and only if it contains all its boundary 
points. This follows at once from the definitions. 


By the closure of a subset $S$ of $X$ we mean the union of $S$ and all its boundary points. The closure of $S$, denoted by $\bar{S}$, is therefore the set of adherent points of $S$. It is also immediately verified that $\bar{S}$ is closed, and is equal to the intersection of all closed sets containing $S$. In particular, we have

$$\bar{\bar{S}} = \bar{S}.$$

As an exercise, the reader should prove that for subsets $S$, $T$ of $X$ we have:

$$\overline{S \cup T} = \bar{S} \cup \bar{T} \quad\text{and}\quad \overline{S \cap T} \subset \bar{S} \cap \bar{T}.$$

Equality does not necessarily hold in the formula on the right. (Example?)

A subset $S$ of a space $X$ is said to be dense (in $X$) if $\bar{S} = X$. For instance, the rationals are dense in the reals.
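
The description of the closure as the set of adherent points translates directly into a computation when the space is finite. The sketch below is an illustration only: the function name `closure` and the three-point space are our own choices. It computes closures in a small space and exhibits one case where the closure of an intersection is strictly smaller than the intersection of the closures.

```python
def closure(S, X, opens):
    """Set of adherent points of S: every open set containing x meets S."""
    S = frozenset(S)
    return frozenset(
        x for x in X
        if all(S & frozenset(U) for U in opens if x in U)
    )

X = {1, 2, 3}
opens = [set(), {1}, {2}, {1, 2}, {1, 2, 3}]   # a topology on X
S, T = {1}, {2}

cl_S = closure(S, X, opens)
cl_T = closure(T, X, opens)
print(sorted(cl_S), sorted(cl_T))
print(sorted(closure(S | T, X, opens)), sorted(cl_S | cl_T))   # equal
print(sorted(closure(S & T, X, opens)), sorted(cl_S & cl_T))   # strict inclusion
```

Here $S \cap T$ is empty, so its closure is empty, while the point $3$ is adherent to both $S$ and $T$.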

Let X be a topological space and S a subset. We define a topology 
on S by prescribing a subset V of S to be open in S if there exists an 
open set $U$ in $X$ such that $V = U \cap S$. The conditions for a topology
on S are immediately verified, and this topology is called the induced 
topology. With this topology, S is called a subspace. 


Note. A subset of $S$ which is open in $S$ may not be open in $X$. For instance, the real line is open in itself, but definitely not open in $\mathbf{R}^2$. Similarly for closed sets. On the other hand, if $U$ is an open subset of $X$, then a subset of $U$ is open in $U$ in the induced topology if and only if it is open in $X$. Similarly, if $S$ is a closed subset of $X$, a subset of $S$ is closed in $S$ if and only if it is closed in $X$.


If P is a certain property of certain topological spaces (e.g. connected, 
or compact as we shall define later), then we say that a subset has
property P if it has this property as a subspace. 

A topology on a set is often defined by means of a base for the open sets. By a base for the open sets we mean a collection $\mathcal{B}$ of open sets such that any open set $U$ is a union (possibly infinite) of elements of $\mathcal{B}$. There is an easy criterion for a collection of subsets to be a base for a topology. Let $X$ be a set and $\mathcal{B}$ a collection of subsets satisfying:

B 1. Every element of $X$ lies in some set in $\mathcal{B}$.

B 2. If $B$, $B'$ are in $\mathcal{B}$ and $x \in B \cap B'$ then there exists some $B''$ in $\mathcal{B}$ such that $x \in B''$ and $B'' \subset B \cap B'$.

If $\mathcal{B}$ satisfies these two conditions, then there exists a unique topology whose open sets are the unions of sets in $\mathcal{B}$. Indeed, such a topology is uniquely determined, and it exists because we can define a set to be open if it is a union of sets in $\mathcal{B}$. The axioms for open sets are trivially verified.
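
For a finite set the criterion B 1, B 2 and the passage from a base to its topology can both be carried out explicitly. The following sketch is an illustration under that finiteness assumption; the names `is_base` and `topology_from_base` are hypothetical and not from the text.

```python
from itertools import combinations

def is_base(X, base):
    """Check conditions B 1 and B 2 for a collection of subsets of a finite set X."""
    base = [frozenset(B) for B in base]
    # B 1: every element of X lies in some set of the collection.
    if any(all(x not in B for B in base) for x in X):
        return False
    # B 2: each point of B1 & B2 lies in some B3 contained in B1 & B2.
    for B1 in base:
        for B2 in base:
            for x in B1 & B2:
                if not any(x in B3 and B3 <= B1 & B2 for B3 in base):
                    return False
    return True

def topology_from_base(base):
    """All unions of sets in the base (the empty union gives the empty set)."""
    base = [frozenset(B) for B in base]
    opens = set()
    for r in range(len(base) + 1):
        for family in combinations(base, r):
            opens.add(frozenset().union(*family))
    return opens

X = {1, 2, 3}
good = [{1}, {2}, {1, 2, 3}]
bad = [{1, 2}, {2, 3}]   # 2 lies in the intersection {2}, but no set of `bad` fits inside {2}

print(is_base(X, good), is_base(X, bad))
print(sorted(sorted(U) for U in topology_from_base(good)))
```

The collection `bad` satisfies B 1 but not B 2, which is exactly the situation in which unions of its members fail to be closed under finite intersection.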


Example. The open balls in a normed vector space form a base for 
the ordinary topology of that space. 


Example. Let $X$ be a set and let $\mathcal{U}$, $\mathcal{V}$ be topologies on $X$, that is collections of open sets satisfying the axioms for a topology. We say that $\mathcal{V}$ is a refinement of $\mathcal{U}$, or that $\mathcal{U}$ is coarser than $\mathcal{V}$, if every set open in $\mathcal{U}$ is also open in $\mathcal{V}$. Thus $\mathcal{U}$ has fewer open sets than $\mathcal{V}$ (“fewer” in the weak sense since $\mathcal{U}$ may be equal to $\mathcal{V}$).


Let $Y$ be a topological space and let $\mathcal{F}$ be a family of mappings $f\colon X \to Y$ of $X$ into $Y$. Let $\mathcal{B}$ be the family of all subsets of $X$ consisting of finite intersections of the sets $f^{-1}(W)$, where $W$ is open in $Y$ and $f$ ranges over $\mathcal{F}$. Then we leave to the reader the verification of the following facts:

1. $\mathcal{B}$ is a base for a topology on $X$, i.e. satisfies conditions B 1, B 2.

2. This topology is the coarsest topology (the one with the fewest open sets) such that every map $f \in \mathcal{F}$ is continuous.

We call this topology the weak topology on $X$ determined by $\mathcal{F}$.

For an application of the weak topology, see Chapter IV, §1 and also the appendix of Chapter IV.

There is a generalization of the weak topology as follows. Instead of considering one space $Y$, we consider a family of spaces $\{Y_i\}$, for $i$ ranging in some index set. We let $\mathcal{F}$ be a family of mappings $f_i\colon X \to Y_i$. We let $\mathcal{B}$ be the family of all subsets of $X$ consisting of finite intersections of sets $f_i^{-1}(U_i)$ where $U_i$ is open in $Y_i$. Then again it is easily verified that $\mathcal{B}$ is a base for a topology, called the weak topology determined by the family $\mathcal{F}$. The product topology defined below will provide an example of this more general case, when the family $\mathcal{F}$ is the family of projections on the factors of a product.


A topological space is said to be separable if it has a countable base. 
(By countable we mean finite or denumerable.) Exercises on separable 
spaces designed to acquaint the reader with them, and essentially all 
trivial, are given at the end of the chapter. It is easy to see that the real 
numbers have a countable base. Indeed, we can take for basis elements 
the open intervals of rational radius, centered at rational points. Similarly, $\mathbf{R}^n$ has a countable base.


Note. In most cases, the property defining separability is equivalent to the property that there exists a countable dense subset (cf. Exercise 15), and this second property is sometimes used to define separability. We find our definition to be more useful but the reader is warned of the discrepancy with some other texts.


An open set containing a point $x$ is called an open neighborhood of this point. By a neighborhood of $x$ we mean any set containing an open set containing $x$. In a normed vector space, one speaks of an $\epsilon$-neighborhood of a point $x$ as being a ball of radius $\epsilon$ centered at $x$.

Let $X$, $Y$ be topological spaces. A map $f\colon X \to Y$ is said to be continuous if the inverse image of an open set (in $Y$) is open in $X$. In other words, if $V$ is open in $Y$ then $f^{-1}(V)$ is open in $X$. Equivalently, we see that a map $f$ is continuous if and only if the inverse image of a closed set is closed.

Proposition 1.1. Let $E$, $F$ be normed vector spaces and let $f\colon E \to F$ be a map. This map is continuous if and only if the usual $(\epsilon, \delta)$ definition is satisfied at every point of $E$.

We prove one of the two implications. Assume that $f$ is continuous and let $x \in E$. Given $\epsilon$, let $V$ be the open ball of radius $\epsilon$ centered at $f(x)$. The open set $U = f^{-1}(V)$ contains an open ball $B$ of radius $\delta$ centered at $x$ for some $\delta$. In particular, if $y \in E$ and $|x - y| < \delta$, then $f(y) \in V$ and $|f(y) - f(x)| < \epsilon$. This proves the $(\epsilon, \delta)$ property. The converse is equally clear and is left to the reader.


Actually, this $(\epsilon, \delta)$ property can be formulated analogously in arbitrary topological spaces, as follows: The map $f\colon X \to Y$ is said to be continuous at a point $x \in X$ if given a neighborhood $V$ of $f(x)$ there exists a neighborhood $U$ of $x$ such that $f(U) \subset V$. It is then verified at once that $f$ is continuous if and only if it is continuous at every point.


Proposition 1.2. Let $X$ be a metric space (or a subset of a normed vector space) and let $f\colon X \to E$ be a map into a normed vector space. Then $f$ is continuous if and only if the following condition is satisfied. Let $\{x_n\}$ be a sequence in $X$ converging to a point $x$. Then $\{f(x_n)\}$ converges to $f(x)$.


The proof will be left as an exercise to the reader.

A composite of continuous maps is continuous.

Indeed, if $f\colon X \to Y$ and $g\colon Y \to Z$ are continuous maps and $V$ is open in $Z$, then

$$(g \circ f)^{-1}(V) = f^{-1}(g^{-1}(V))$$

is seen to be open.

As usual, we observe that a continuous image of an open set is not 
necessarily open. 

A continuous map $f\colon X \to Y$ which admits a continuous inverse map $g\colon Y \to X$ is called a homeomorphism, or topological isomorphism. It is clear that a composite of homeomorphisms is also a homeomorphism. As usual, we observe that a continuous bijective map need not be a homeomorphism. In fact, later in this course, we meet many examples of vector spaces with two different norms on them such that the identity map is continuous but not bicontinuous.
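
The smallest instance of a continuous bijection which is not a homeomorphism already occurs on a two-point set: the identity map from the discrete topology to the topology of Example 1. The sketch below checks this by testing inverse images directly; the names `preimage` and `is_continuous`, and the encoding of a map as a dictionary, are assumptions made for the illustration.

```python
def preimage(f, V, X):
    """Inverse image of V under the map f, given as a dict from X to Y."""
    return frozenset(x for x in X if f[x] in V)

def is_continuous(f, X, opens_X, opens_Y):
    """Continuity: the inverse image of every open set of Y is open in X."""
    opens_X = {frozenset(U) for U in opens_X}
    return all(preimage(f, V, X) in opens_X for V in opens_Y)

X = {0, 1}
discrete = [set(), {0}, {1}, {0, 1}]   # every subset is open
coarse = [set(), {0, 1}]               # only the empty set and X are open
identity = {0: 0, 1: 1}

# Identity from (X, discrete) to (X, coarse): continuous.
print(is_continuous(identity, X, discrete, coarse))
# Its inverse, the identity from (X, coarse) to (X, discrete): not continuous.
print(is_continuous(identity, X, coarse, discrete))
```

The inverse fails because the inverse image of the open set $\{0\}$ is not open in the coarse topology.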

Let $\{X_i\}_{i \in I}$ be a family of topological spaces and let

$$X = \prod_{i \in I} X_i$$

be their product. We define a topology on $X$, called the product topology, by characterizing a subset $U$ of $X$ to be open if for each $x \in U$ there exists a finite number of indices $i_1, \ldots, i_n$ and open sets $U_{i_1}, \ldots, U_{i_n}$ in the spaces $X_{i_1}, \ldots, X_{i_n}$ respectively such that

$$x \in U_{i_1} \times \cdots \times U_{i_n} \times \prod_{i \ne i_k} X_i \subset U.$$

The product for $i \ne i_k$ is taken for all indices $i$ unequal to $i_1, \ldots, i_n$. In other words, we can say that the product topology is the one having as a base all sets of the form

$$U_{i_1} \times \cdots \times U_{i_n} \times \prod_{i \ne i_k} X_i.$$

Such sets have arbitrary open sets at a finite number of components, and the full space at all other components.

The product topology is the unique topology with the fewest open sets in $X$ which makes each projection map

$$\pi_i\colon X \to X_i$$

continuous. Indeed, for each open set $U_j$ in $X_j$, the set

$$\pi_j^{-1}(U_j) = U_j \times \prod_{i \ne j} X_i$$

must be open if $\pi_j$ is continuous, and our previous assertion follows. In other words, it is the weak topology determined by the family of all projections on the factors.

More generally, given a set and a family of mappings of this set into topological spaces, one can define a unique topology on the set making all these mappings continuous, and having the fewest open sets doing this, namely the weak topology. If $S$ is a set, and

$$\{f_i\colon S \to Y_i\}_{i \in I}$$

is a family of maps into topological spaces $Y_i$, then the map

$$f\colon S \to \prod_{i \in I} Y_i$$

such that $f(x) = \{f_i(x)\}$ is continuous for this topology.


Example 8. We can give $\mathbf{R}^n$ the product topology, which is called the ordinary topology. We define the sup norm on $\mathbf{R}^n$ by

$$\|x\| = \max_i |x_i|$$

if $x = (x_1, \ldots, x_n)$ is given in terms of its coordinates. Then the topology determined by this norm is clearly the same as the product topology.
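
The reason the two topologies agree can be written out in one line. The display below is our own spelling-out of the remark, with $v = (v_1, \ldots, v_n)$ and $r > 0$: an open ball for the sup norm is a product of open intervals.

```latex
\[
  B_r(v) = \{x \in \mathbf{R}^n : \max_i |x_i - v_i| < r\}
         = (v_1 - r,\, v_1 + r) \times \cdots \times (v_n - r,\, v_n + r).
\]
```

Thus every ball is open in the product topology. Conversely, if $U_1 \times \cdots \times U_n$ is a product of open subsets of $\mathbf{R}$ containing $x$, choose $r_i > 0$ with $(x_i - r_i, x_i + r_i) \subset U_i$ and put $r = \min r_i$; then $B_r(x) \subset U_1 \times \cdots \times U_n$.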


Remark. A map $f\colon X \to Y$ which maps open sets onto open sets is
said to be open. A map which maps closed sets onto closed sets is said 
to be closed. A continuous map need not be either. For instance, the 
graph of the tangent is closed in the plane, but the projection map on 
the x-axis maps it on an open interval: 


Figure 2.1 


The map which folds the plane over the real axis maps the open plane on the closed half plane. If $f\colon X \to Y$ is continuous and bijective, then a necessary and sufficient condition that $f$ be a homeomorphism is that $f$ be open. This is simply a rephrasing of the continuity of the inverse mapping $f^{-1}$.


II, §2. CONNECTED SETS


A topological space X is said to be connected if it is not possible to 
express X as a union of two disjoint non-empty open sets. Of course, we 
can formulate the definition in terms of closed sets instead of open sets. 

The reader’s intuition of connectedness probably comes from the pos- 
sibility of connecting two points of a set by a path. We shall discuss the 
relation between this notion and the general notion later, after developing 
first some basic properties of connected sets. 


Proposition 2.1. Let $f\colon X \to Y$ be a continuous map. If $X$ is connected then the image of $X$ is connected.

Proof. Without loss of generality we may assume that $Y$ is the image of $f$. Suppose that $Y$ is not connected, so that we can write $Y = U \cup V$ where $U$, $V$ are open, non-empty, and disjoint. Then

$$X = f^{-1}(U) \cup f^{-1}(V)$$

expresses $X$ as a union of two disjoint non-empty open sets, which is impossible. This proves our assertion.


Proposition 2.2. A topological space X is connected if and only if every 
continuous map of X into a discrete space having at least two elements 
is constant. 


Proof. Assume that X is connected, and that f is a continuous map of 
X into a discrete space with at least two elements. If f is not constant, 
we can write the image of f as a union of two disjoint non-empty sets, 
open by definition, and this contradicts our previous result. Conversely, 
suppose that we can write X = UU V as a disjoint union of non-empty 
open sets. Let $p$, $q$ be two distinct objects and let the set $\{p, q\}$ have the discrete topology. If we define

$$f\colon X \to \{p, q\}$$

to be the map such that

$$f(U) = \{p\} \quad\text{and}\quad f(V) = \{q\},$$

then $f$ is continuous and not constant, as was to be shown.


Observe that our proof shows that instead of taking a discrete space 
having at least two points, we can take a space with exactly two points 
in characterizing a connected set, as we have just done. 
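
For a finite topological space the definition of connectedness can be tested directly by searching for an open set whose complement is also open. The following is a minimal sketch under that finiteness assumption; the function name `is_connected` and the two sample topologies are our own.

```python
def is_connected(X, opens):
    """X is connected if no open set other than the empty set and X has an open complement."""
    X = frozenset(X)
    opens = {frozenset(U) for U in opens}
    for U in opens:
        if U and U != X and (X - U) in opens:
            # X is the union of the disjoint non-empty open sets U and X - U.
            return False
    return True

X = {1, 2, 3}
nested = [set(), {1}, {1, 2}, X]    # open sets form a chain, so no splitting exists
split = [set(), {1}, {2, 3}, X]     # {1} and {2, 3} disconnect X

print(is_connected(X, nested), is_connected(X, split))
```

In the second case the map sending $1$ to $p$ and the points $2$, $3$ to $q$ is a non-constant continuous map into a discrete two-point space, as in Proposition 2.2.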


Proposition 2.3. Let $X$ be a topological space and let $\{S_i\}_{i \in I}$ be a family of subspaces which are connected. If they have a point in common then their union is connected.


Proof. Let $a$ lie in the intersection of all $S_i$. If we can write

$$\bigcup_{i \in I} S_i = U \cup V,$$

where $U$, $V$ are disjoint and open in this union, then $S_i \cap U$ and $S_i \cap V$ are open in $S_i$ for each $i$ and hence $S_i \subset U$ or $S_i \subset V$. If for some $i$ we have $S_i \subset U$, then $a \in U$ and consequently we must have $S_i \subset U$ for all $i$, thus proving our assertion.




As a consequence of the preceding statement, we define the connected 
component of a point a in X to be the union of all connected subspaces 
of X containing a. This component is actually not empty, because the 
set consisting of a alone is connected. 


Proposition 2.4. Let $X$ be a topological space and $S$ a connected subset. Then the closure of $S$ is connected. In fact, if $S \subset T \subset \bar{S}$, then $T$ is connected.


Proof. Left to the reader. 

Corollary 2.5. The connected component of a point is closed. 

Proof. Clear. 

As promised, we now discuss the relation between the naive notion of connectedness and the general notion. Let $X$ be a topological space. We say that $X$ is arcwise connected if given two points $x$, $y$ in $X$ there exists a piecewise continuous path from $x$ to $y$. By a piecewise continuous path, we mean a sequence of continuous maps $\{\alpha_1, \ldots, \alpha_r\}$, where each

$$\alpha_i\colon [a_i, b_i] \to X$$

is a continuous map defined on a closed interval $[a_i, b_i]$ such that

$$\alpha_i(b_i) = \alpha_{i+1}(a_{i+1}).$$

We say that this path goes from $x$ to $y$ if

$$\alpha_1(a_1) = x \quad\text{and}\quad \alpha_r(b_r) = y.$$

Of course, if such a path exists, then it is easy to define just one continuous map

$$\alpha\colon [a, b] \to X$$

from some interval $[a, b]$ into $X$ such that $\alpha(a) = x$ and $\alpha(b) = y$. One can even take the interval $[a, b]$ to be $[0, 1]$.
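
One way to carry out this reduction, sketched here under our own conventions, is to cut $[0, 1]$ into $r$ equal subintervals and rescale the $i$-th one linearly onto $[a_i, b_i]$. The function name `concatenate` and the encoding of a piece as a triple `(a_i, b_i, alpha_i)` are assumptions for the illustration.

```python
def concatenate(pieces):
    """pieces: list of (a_i, b_i, alpha_i) with alpha_i(b_i) == alpha_{i+1}(a_{i+1}).
    Returns a single map alpha on [0, 1] running through the pieces in order."""
    r = len(pieces)

    def alpha(t):
        # The i-th piece is used on [i/r, (i+1)/r], rescaled linearly to [a_i, b_i].
        i = min(int(t * r), r - 1)
        a, b, piece = pieces[i]
        s = t * r - i                 # position inside the i-th subinterval, in [0, 1]
        return piece(a + s * (b - a))

    return alpha

# Two segments in the plane: (0,0) to (1,0) on [0, 1], then (1,0) to (1,2) on [5, 7].
alpha_1 = lambda t: (t, 0.0)
alpha_2 = lambda t: (1.0, t - 5.0)
alpha = concatenate([(0.0, 1.0, alpha_1), (5.0, 7.0, alpha_2)])

print(alpha(0.0), alpha(0.5), alpha(1.0))
```

Continuity of the resulting map at the junction points follows from the matching condition $\alpha_i(b_i) = \alpha_{i+1}(a_{i+1})$, since both one-sided limits there take this common value.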


Proposition 2.6. Any interval of real numbers is connected. 


Proof. We give the proof for a closed interval $J = [a, b]$ and leave the other cases (open, half-open, infinite intervals) as exercises. Suppose that we can write $J = A \cup B$ where $A$, $B$ are closed, disjoint, and non-empty. Say that $a \in A$. Let $c$ be the greatest lower bound of $B$. Then $c$ lies in the closure of $B$ and since $B$ is closed, $c \in B$, so $c \ne a$. For any $x \in J$ with $a \le x < c$, we must have $x \in A$ since $c$ is a lower bound for $B$. Since $A$ is closed, and since $c$ lies in the closure of the interval $a \le x < c$, it follows that $c$ lies in $A$, a contradiction which proves our assertion.


Proposition 2.7. If a topological space is arcwise connected, then it is 
connected. 


Proof. Let $X$ be arcwise connected and suppose that we can write $X$ as a disjoint union of non-empty open sets $U$, $V$. Let $x \in U$ and $y \in V$. There exists a continuous map $\alpha\colon J \to X$ from a closed interval into $X$ starting at $x$ and ending at $y$. Then $\alpha^{-1}(U)$ and $\alpha^{-1}(V)$ express $J$ as a disjoint union of non-empty sets which are open in $J$, contradicting Proposition 2.6.

The converse of the preceding result is false. For instance the subset of 
the plane consisting of the y-axis and the graph of the curve y = sin(1/x) 
is connected but not arcwise connected. In practice, however, most ordi- 
nary sets which are connected are also arcwise connected, and the sort of 
pathology which arises from sin(1/x) is just that: pathology. In Exercise 
12, you will prove that an open subset of a normed vector space is 
connected if and only if it is arcwise connected. 


Theorem 2.8. Let $\{X_i\}_{i \in I}$ be a family of connected topological spaces. Then the product

$$X = \prod_{i \in I} X_i$$

is connected.

Proof. Let $f\colon X \to \{p, q\}$ be a continuous map of $X$ into a discrete space consisting of two points. We must show that $f$ is constant. Let $a \in X$ and say that $f(a) = p$. Then $f^{-1}(p)$ contains an open neighborhood of $a$ of the form

$$U = U_{i_1} \times \cdots \times U_{i_n} \times \prod_{i \ne i_k} X_i.$$

Let $b$ be any other point of $X$ and write $a$, $b$ in terms of their coordinates:

$$a = (a_{i_1}, \ldots, a_{i_n}, \ldots),$$
$$b = (b_{i_1}, \ldots, b_{i_n}, \ldots).$$

Let

$$z = (a_{i_1}, \ldots, a_{i_n}, (b_i)_{i \ne i_1, \ldots, i_n})$$

so that the coordinates of $z$ are the same as those of $a$ for $i_1, \ldots, i_n$ and the same as those of $b$ for the other indices. Then $z \in U$ and $f(z) = p$.

Consider the composite of maps

$$X_{i_1} \xrightarrow{\ g\ } X \xrightarrow{\ f\ } \{p, q\},$$

where $g$ is the injective mapping such that

$$g(x_{i_1}) = (x_{i_1}, a_{i_2}, \ldots, a_{i_n}, (b_i)_{i \ne i_1, \ldots, i_n}).$$

Then $g$ is continuous, so is $f \circ g$, and since the continuous image of a connected set is connected, it follows that $f \circ g$ is constant on $X_{i_1}$. In particular, $f \circ g(a_{i_1}) = f(z) = p$, and also

$$f(b_{i_1}, a_{i_2}, \ldots, a_{i_n}, (b_i)_{i \ne i_1, \ldots, i_n}) = p.$$

We now perform the same trick, replacing $a_{i_2}$ by $b_{i_2}$, ..., and $a_{i_n}$ by $b_{i_n}$. We then see that $f(b) = p$, thus proving that $f$ is constant, which proves the theorem.


Corollary 2.9. Euclidean n-space $\mathbf{R}^n$ is connected, and so is the
product of any number of intervals.


Let X be a set and $\{S_\alpha\}_{\alpha \in A}$ a family of subsets. We say
that this family is a covering of X if its union is equal to X. If X is a
topological space, and $\{U_\alpha\}_{\alpha \in A}$ is a covering, we say it
is an open covering if each $U_\alpha$ is open. If $\{S_\alpha\}_{\alpha \in A}$
is a covering of X, we define a subcovering to be a covering
$\{S_\beta\}_{\beta \in B}$ where B is a subset of A. In particular, a finite
subcovering of $\{S_\alpha\}$ is a covering $\{S_{\alpha_1}, \ldots, S_{\alpha_n}\}$.

Let X be a topological space. We shall say that X is compact if any
open covering of X has a finite subcovering. As usual, we can express a
dual condition relative to closed sets. Let $\{F_\alpha\}_{\alpha \in A}$ be a
family of subsets of X. We say that this family has the finite intersection
property if any finite intersection

    $F_{\alpha_1} \cap \cdots \cap F_{\alpha_n}$

is not empty.


Proposition 3.1. A topological space X is compact if and only if, for
any family $\{F_\alpha\}_{\alpha \in A}$ of closed sets having the finite
intersection property, the intersection

    $\bigcap_{\alpha \in A} F_\alpha$

is not empty.




Proof. Assume that X is compact and let $\{F_\alpha\}$ be a family of closed
sets having the finite intersection property. Suppose that the intersection
of this family is empty. Then the complements $\mathscr{C}F_\alpha$ form an
open covering of X, and there is a finite subcovering by open sets
$\{\mathscr{C}F_{\alpha_1}, \ldots, \mathscr{C}F_{\alpha_n}\}$.
Taking the complement, we conclude that the intersection

    $F_{\alpha_1} \cap \cdots \cap F_{\alpha_n}$

is empty, which contradicts the finite intersection property. Hence the
intersection of the whole family is not empty. The converse is equally
clear.


Proposition 3.2. A continuous image of a compact set is compact. 


Proof. Let X be compact, and let $f: X \to Y$ be a continuous map,
which is surjective. Let $\{V_\alpha\}$ be an open covering of Y. Then
$\{f^{-1}(V_\alpha)\}$ is an open covering of X, and there is a finite
subcovering

    $\{f^{-1}(V_{\alpha_1}), \ldots, f^{-1}(V_{\alpha_n})\}$.

It follows that $\{V_{\alpha_1}, \ldots, V_{\alpha_n}\}$ is a covering of Y,
as was to be shown.


Proposition 3.3. A closed subspace of a compact space is compact.


Proof. Let X be a compact space and S a closed subspace. Let $\{U_\alpha\}$
be a covering of S by open sets in X. Let U be the complement of S in
X. Then $\{U_\alpha\}$ together with U form an open covering of X, having a
finite subcovering

    $\{U_{\alpha_1}, \ldots, U_{\alpha_n}, U\}$.

Since U is disjoint from S, it follows that already
$U_{\alpha_1}, \ldots, U_{\alpha_n}$ cover S, thus proving our assertion.

The converse of the preceding assertion is almost true but not quite.
A topological space X is said to be Hausdorff if given points $x, y \in X$
and $x \neq y$ there exist disjoint open sets U, V such that $x \in U$ and
$y \in V$. If X is Hausdorff, then each point of X is obviously closed.


Proposition 3.4. A compact subspace of a Hausdorff space is closed. 


Proof. Let S be a compact subset of the Hausdorff space X. We
prove that its complement is open. Let x be in the complement. For
each $y \in S$ there exist disjoint open sets $U_y$, $V_y$ such that
$x \in U_y$ and $y \in V_y$. The family $\{V_y\}_{y \in S}$ covers S and
there is a finite subcovering

    $\{V_{y_1}, \ldots, V_{y_n}\}$.

Then the intersection $U_{y_1} \cap \cdots \cap U_{y_n}$ is open, contains x,
and is contained in the complement of S, thus proving what we want.

A topological space X is said to be normal if it is Hausdorff, and if
given two disjoint closed sets A, B in X there exist disjoint open sets U,
V such that $A \subset U$ and $B \subset V$.


Proposition 3.5. A compact Hausdorff space is normal. In fact, if A, B
are compact subsets of a Hausdorff space, and are disjoint, there exist
disjoint open sets U, V such that $A \subset U$ and $B \subset V$.


Proof. The proof is similar to the previous one, and involves merely
one further application of the same principle. Using the same trick as in
this previous proof, we know that for each $x \in A$ there exist disjoint
open sets $U_x$, $W_x$ such that $x \in U_x$ and $B \subset W_x$. (One would
take the finite union of the open sets $V_{y_1}, \ldots, V_{y_n}$ to obtain
$W_x$ in the analogous situation.) The family of open sets $\{U_x\}_{x \in A}$
covers A, and there exists a finite subcovering

    $\{U_{x_1}, \ldots, U_{x_m}\}$.

The open sets $U_{x_1} \cup \cdots \cup U_{x_m}$ and
$W_{x_1} \cap \cdots \cap W_{x_m}$ solve our problem.


In the case of Hausdorff spaces, or normal spaces, we say also that 
points (or closed sets) can be separated by open sets. The properties of 
being Hausdorff or normal are thus called separation properties. 

It is clear that a subspace of a Hausdorff space is Hausdorff. The 
analogous statement for normal spaces is not necessarily true (cf. Kelley 
[Ke], Exercise F, p. 132). 

The general notion of a compact space is, in many practical cases,
equivalent with another notion with which the reader is probably already
familiar. We call a space X sequentially compact if it has the
Weierstrass-Bolzano property, namely every sequence $\{x_n\}$ in X has a
point of accumulation (a point c such that given an open neighborhood U of
c, there exist infinitely many n such that $x_n \in U$). As usual, an
equivalent condition is that an infinite subset of X has a point of
accumulation. It is an exercise to prove:


Proposition 3.6. If a topological space has a countable base, then it is 
compact if and only if it is sequentially compact. 
(Cf. Exercise 19.) 


The preceding criterion will not be used in this book. 


Proposition 3.7. Compactness implies sequential compactness.


Proof. Let X be compact. It will suffice to prove that an infinite
subset of X has a point of accumulation. Suppose that this is not the
case, and let S be an infinite subset. Given $x \in X$, there exists an open
set $U_x$ containing x but containing only a finite number of the elements
of S. The family $\{U_x\}_{x \in X}$ covers X. Let $\{U_{x_1}, \ldots, U_{x_n}\}$
be a finite subcovering. We conclude that there is only a finite number of
elements of S lying in the finite union

    $U_{x_1} \cup \cdots \cup U_{x_n} = X$.

This is a contradiction, which proves our assertion.

The converse is true under important and rather general conditions, as 
shown in the next theorem. 


Theorem 3.8. Let S be a subset of a metric space, or of a normed 
vector space. 


(i) S is compact if and only if S is sequentially compact. 
(ii) S is compact if and only if S is complete, and given r > 0 there 
exists a finite number of open balls of radius r which cover S. 


Proof. We have already proved that compactness implies sequential
compactness. Conversely, assume that S is sequentially compact. Then
certainly S is complete, and we shall prove that the other condition
stated in (ii) is satisfied. Suppose it is not. Then there exists $r > 0$
such that S cannot be covered by a finite number of open balls of radius r.
Let $x_1 \in S$ and let $B_1$ be the open ball of radius r centered at $x_1$.
Then $B_1$ does not contain S, and there is some $x_2 \in S$, $x_2 \notin B_1$.
Proceeding inductively, suppose that we have found open balls
$B_1, \ldots, B_n$ of radius r, and points $x_1, \ldots, x_n$ with
$x_i \in B_i$ such that $x_{k+1}$ does not lie in $B_1 \cup \cdots \cup B_k$.
We can then find $x_{n+1} \in S$ which does not lie in
$B_1 \cup \cdots \cup B_n$, and we let $B_{n+1}$ be the open ball of radius r
centered at $x_{n+1}$. Let v be a point of accumulation of the sequence
$\{x_n\}$. By definition, there exist positive integers m, k with
$k > m$ such that

    $|x_k - v| < r/2$    and    $|x_m - v| < r/2$.

Then $|x_k - x_m| < r$ and this contradicts the property of our sequence
$\{x_n\}$ because $x_k$ lies in the ball $B_m$. This proves that S satisfies
the condition of (ii).


Now assume this condition. Let $\{U_i\}_{i \in I}$ be an open covering of S,
and suppose that there is no finite subcovering. We construct a sequence
$\{x_n\}$ in S inductively as follows. We know that S is covered by a finite
number of closed balls of radius 1/2. Hence there exists at least one closed
ball $C_1$ of radius 1/2 such that $C_1 \cap S$ is not covered by a finite
number of $U_i$. We let $x_1$ be a point of $C_1 \cap S$. Suppose that we have
obtained a sequence of closed balls

    $C_1 \supset C_2 \supset \cdots \supset C_n$

such that $C_n$ has radius $1/2^n$, with a point $x_n \in C_n \cap S$, and such
that $C_n \cap S$ is not covered by a finite number of $U_i$. Since S itself
can be covered by a finite number of closed balls of radius $1/2^{n+1}$, it
follows that $C_n \cap S$ can also be so covered, and hence there exists a
closed ball $C_{n+1}$ of radius $1/2^{n+1}$ and such that $C_{n+1} \cap S$
cannot be covered by a finite number of $U_i$. We let $x_{n+1}$ be a point of
$C_{n+1} \cap S$. This constructs our sequence as desired. We see that
$\{x_n\}$ is a Cauchy sequence in S, which converges to a point x in S. But x
lies in some $U_{i_0}$, which contains $C_n$ for all sufficiently large n, a
contradiction which proves our theorem.
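
The inductive choice of the points $x_1, x_2, \ldots$ in the first half of
the proof can be imitated on a finite sample. The following is only a
sketch under assumptions that are not in the text: the set S is replaced by
a finite grid in the unit square with the euclidean distance, and the
function name `greedy_net` is hypothetical. Each new center is required to
lie outside the open balls of radius r around the centers already chosen,
so the centers stay at mutual distance at least r, and the process stops
exactly when the open balls cover the sample.

```python
import math

def greedy_net(points, r):
    """Choose centers one at a time, each lying outside every open ball
    of radius r centered at a previously chosen point."""
    centers = []
    for p in points:
        if all(math.dist(p, c) >= r for c in centers):
            centers.append(p)
    return centers

# Hypothetical sample standing in for S: a 21 x 21 grid in the unit square.
grid = [(i / 20, j / 20) for i in range(21) for j in range(21)]
net = greedy_net(grid, 0.25)
print(len(net), "open balls of radius 0.25 cover the grid")
```

On a totally bounded set the selection must stop after finitely many
steps, which is the content of the contradiction in the proof.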


A subset S of a metric space, or a normed vector space, which can be 
covered by a finite number of open balls of given radius r > 0 is said to 
be totally bounded. We can phrase (ii) by saying that S is compact if and 
only if it is complete and totally bounded. A subset of a topological 
space is said to be relatively compact if its closure is compact. From (ii) 
we get a convenient criterion for relative compactness. 


Corollary 3.9. Let S be a subset of a complete normed vector space. 
Assume that given r >0 there exists a finite covering of S by balls of 
radius r. Then S is relatively compact. 


Proof. The closure $\bar{S}$ of S has the same property, because if S is
covered by a finite number of balls of radius r/2, then the closure of S is
covered by a finite number of balls of radius r (centered at the same
points). Also $\bar{S}$ is complete, being a closed subset of a complete
space. Hence we conclude that the closure of S is compact.


As an application of Theorem 3.8, we recall that a closed (bounded) 
interval in R has the Weierstrass—Bolzano property. Hence it is compact, 
and therefore so is any closed bounded subset of R (being a closed subset 
of a compact set). The converse is also true, since a compact set is 
closed, and must be bounded, otherwise one can find an infinite sequence 
tending to infinity, and not having a point of accumulation. 

One can also prove the compactness of a closed interval directly from
the least upper bound axiom, as follows. Let $a < b$, and let
$\{U_i\}_{i \in I}$ be an open covering of $[a, b]$. Let S be the set of all
$x \in [a, b]$ such that $[a, x]$ admits a finite subcovering. Then S is not
empty (because $a \in S$) and is bounded from above by b. Let c be its least
upper bound. Then $c \in U_{i_0}$ for some index $i_0$. If $a < c$, select a
number t with $a < t < c$ such that the interval $[t, c]$ is contained in
$U_{i_0}$. If $a = c$, let $t = a$. Then $[a, t]$ can be covered by a finite
number of sets $U_i$, say $U_{i_1}, \ldots, U_{i_n}$. If $c \neq b$, then
$U_{i_0}, U_{i_1}, \ldots, U_{i_n}$ cover an interval $[a, c']$ with $c' > c$,
a contradiction, proving that $c = b$ and that $[a, b]$ is compact.


One can generalize to arbitrary compact sets some standard theorems 
on closed intervals, e.g.: 


Proposition 3.10. Let A be a compact set, and $f: A \to \mathbf{R}$ a
continuous function on A. Then f has a maximum (a point $c \in A$ such that

    $f(c) \geq f(x)$ for all $x \in A$).


Proof. The image f(A) is compact, so closed and bounded. The least 
upper bound of f(A) lies in f(A), thus proving our assertion. 


If A is a subset of a normed vector space, and if $f: A \to F$ is a
continuous map into some normed vector space F, then we say that f is
uniformly continuous on A if given $\epsilon$ there exists $\delta$ such that
whenever $x, y \in A$ and $|x - y| < \delta$, then $|f(x) - f(y)| < \epsilon$.
We recall the theorem from elementary analysis that:


Proposition 3.11. Let A be a compact subset of a normed vector space.
If $f: A \to F$ is a continuous map into a normed vector space, then f is
uniformly continuous. In fact, if A is contained in a subset S of a
normed vector space, if f is defined on S and continuous on A, then
given $\epsilon$ there exists $\delta$ such that if $x \in A$ and $y \in S$
and $|x - y| < \delta$, then

    $|f(x) - f(y)| < \epsilon$.


We recall the proof briefly. Given $\epsilon$, for each $x \in A$ we let
$r(x) > 0$ be such that if $|y - x| < r(x)$, then $|f(y) - f(x)| < \epsilon$.
We can cover A by open balls $B_i$ of radius

    $\delta_i = r(x_i)/2$,

centered at $x_i$ ($i = 1, \ldots, n$). We let $\delta = \min \delta_i$. If
$x \in A$, then for some i we have $|x - x_i| < r(x_i)/2$. If $|y - x| < \delta$,
then $|y - x_i| < r(x_i)$ so that

    $|f(y) - f(x)| \leq |f(y) - f(x_i)| + |f(x_i) - f(x)| < 2\epsilon$,

as was to be shown.


The preceding definition of uniform continuity, and the result just
proved, are of course valid for metric spaces, with the usual notation
$d(x, y)$ replacing $|x - y|$. The property which we proved, and which is
slightly stronger than uniform continuity on A, will be called relative
uniform continuity (relative to S, that is).


The only non-trivial theorem of this section is the theorem that a
product of compact spaces is compact. In situations when one can use
sequences, and one takes a finite product of spaces, however, the proof is
immediate. For instance, let E, F be normed vector spaces, and let S, T
be compact subsets of E, F, respectively. Let $\{z_n\}$ be a sequence in
$S \times T$, and write $z_n = (x_n, y_n)$ with $x_n \in S$ and $y_n \in T$.
We can find a subsequence $\{x_{n_1}\}$ converging to a point a in S. We can
then find a further subsequence $\{y_{n_2}\}$ of $\{y_{n_1}\}$ converging to a
point b in T. Then the sequence $\{z_{n_2}\}$ converges to $(a, b)$, so that
$S \times T$ is sequentially compact.


The idea for this proof is to project on the coordinates, and from 
coordinatewise convergence, get the convergence in the product space. 
However, if we do it for an infinite product, the above proof seems to fail 
because we may exhaust all the indices before being through with the 
proof. One can still formulate the basic idea so that it essentially carries 
over to the most general case. Part of the difficulty in doing this is that 
the points of accumulation in the various coordinate spaces are not 
uniquely determined. Thus one must find a set theoretic device which 
chooses simultaneously a point of accumulation in all coordinate spaces. 
The proof below is due to Bourbaki. 


Theorem 3.12 (Tychonoff's Theorem). Let $\{X_\alpha\}_{\alpha \in A}$ be a
family of compact spaces. Then the product

    $X = \prod_{\alpha \in A} X_\alpha$

is compact.


Proof. Let $\mathcal{F} = \{F_i\}_{i \in I}$ be a family of closed subsets of
the product, having the finite intersection property. The set of families of
subsets of X (not necessarily closed) containing our given family
$\mathcal{F}$ and having the finite intersection property is ordered by
ascending inclusion. One verifies immediately by taking the usual union that
it is inductively ordered. By Zorn's lemma, $\mathcal{F}$ is therefore
contained in a maximal family $\mathcal{F}^*$ having the finite intersection
property. Let

    $\pi_\alpha: X \to X_\alpha$

be the projection on the $\alpha$-th factor. For each $\alpha$, the family of
closed sets

    $\{\overline{\pi_\alpha(F)}\}$,  $F \in \mathcal{F}^*$,

has the finite intersection property, and consequently, $X_\alpha$ being
compact, there exists an element $x_\alpha$ lying in each set
$\overline{\pi_\alpha(F)}$ for all $F \in \mathcal{F}^*$. Let $x = (x_\alpha)$.
We contend that x belongs to the closure of every set $F \in \mathcal{F}^*$.
This will prove our theorem.

To prove our contention, we observe that the intersection of a finite
number of sets in $\mathcal{F}^*$ also lies in $\mathcal{F}^*$ because of the
maximality of $\mathcal{F}^*$. Let U be an open set of X containing x, of the
form

    $U = U_{\alpha_1} \times \cdots \times U_{\alpha_n} \times \prod_{\alpha \neq \alpha_i} X_\alpha$

with each $U_{\alpha_i}$ open in $X_{\alpha_i}$. Then $U_{\alpha_i}$ contains
$x_{\alpha_i}$ for all i, and therefore $U_{\alpha_i}$ contains a point of
$\pi_{\alpha_i}(F)$ for all $F \in \mathcal{F}^*$. Hence

    $\pi_{\alpha_i}^{-1}(U_{\alpha_i}) = U_{\alpha_i} \times \prod_{\alpha \neq \alpha_i} X_\alpha$

contains a point of F for each $F \in \mathcal{F}^*$. Because of the
maximality of $\mathcal{F}^*$ with respect to the finite intersection
property, it follows that

    $\pi_{\alpha_i}^{-1}(U_{\alpha_i})$

belongs to $\mathcal{F}^*$, and hence the finite intersection of these sets
for $i = 1, \ldots, n$ also belongs to $\mathcal{F}^*$. But this finite
intersection is nothing else but our set U, and hence U intersects each F in
$\mathcal{F}^*$, so a fortiori each $F \in \mathcal{F}$. Hence x lies in the
closure of each $F \in \mathcal{F}$, whence $x \in F$ for all
$F \in \mathcal{F}$ because these sets are closed, as was to be shown.


Corollary 3.13. A subset of $\mathbf{R}^n$ is compact if and only if it is
closed and bounded.


Proof. Let S be a subset of $\mathbf{R}^n$ and assume first that S is closed and
bounded. Then S is contained in the product of a finite number of 
closed intervals, and is therefore a closed subset of a compact space. It is 
thus compact. Conversely, if it is compact, it is closed, and it must be 
bounded; otherwise, one can find a sequence of elements in S going out 
to infinity, and not having a point of accumulation. 


Corollary 3.14. All norms on $\mathbf{R}^n$ are equivalent.


Proof. Let $\| \ \|$ be the sup norm, and $| \ |$ any other norm. It will
suffice to prove that these two norms are equivalent. If $e_1, \ldots, e_n$
are the usual unit vectors of $\mathbf{R}^n$, then for
$x = x_1 e_1 + \cdots + x_n e_n$ we get

    $|x| \leq |x_1| |e_1| + \cdots + |x_n| |e_n| \leq C \|x\|$

with $C = n \cdot \max |e_i|$. This proves one of the desired inequalities,
and also shows that the other norm is continuous, because

    $\bigl| |x| - |y| \bigr| \leq |x - y| \leq C \|x - y\|$.

Let $S_1$ be the unit sphere centered at the origin for the sup norm. Then
$S_1$ is closed and bounded, so compact, and the other norm has a minimum
on $S_1$, say at v. Since $v \neq 0$ we have $|v| > 0$. Thus for any
$x \in \mathbf{R}^n$, $x \neq 0$, we get

    $\bigl| x / \|x\| \bigr| \geq |v|$,  and hence  $|v| \|x\| \leq |x|$.

This yields the other inequality, and proves our corollary.
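
As a numerical illustration of the two inequalities, one can take for the
"other norm" the sum norm $|x| = |x_1| + \cdots + |x_n|$. This choice, and
the function names below, are assumptions made only for the example. In
that case $|e_i| = 1$, so $C = n$, and the minimum of $| \ |$ on the sup-norm
unit sphere is 1 (attained at $e_1$). The sketch checks
$|v| \|x\| \leq |x| \leq C \|x\|$ on random vectors.

```python
import random

def sup_norm(x):
    return max(abs(t) for t in x)

def sum_norm(x):
    return sum(abs(t) for t in x)

def check_equivalence(n, trials=1000, seed=0):
    """Check |v| * ||x|| <= |x| <= C * ||x|| for the sum norm | | and the
    sup norm || || on R^n, with C = n and |v| = 1."""
    rng = random.Random(seed)
    C = float(n)   # n * max |e_i|, and |e_i| = 1 for the sum norm
    v_min = 1.0    # minimum of the sum norm on the sup-norm unit sphere
    for _ in range(trials):
        x = [rng.uniform(-10.0, 10.0) for _ in range(n)]
        s, t = sup_norm(x), sum_norm(x)
        # small tolerance on the upper bound for floating-point rounding
        if not (v_min * s <= t <= C * s + 1e-9):
            return False
    return True

print(check_equivalence(3), check_equivalence(7))
```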


Using coordinates, we see that Corollary 3.14 also applies to a finite 
dimensional vector space. A closed subset of a complete metric space is 
complete, and a complete subset of a metric space is closed. We con- 
clude that a finite dimensional subspace of a normed vector space is 
complete, and therefore closed. 

A space X is said to be locally compact if every point has a compact
neighborhood. For instance, $\mathbf{R}^n$ is locally compact, and so is any
finite dimensional vector space. It is clear that a normed vector space is
locally compact if and only if the closed unit ball is compact. (If the
space is locally compact, then some closed ball of radius $r > 0$ is
compact, and hence the unit ball is compact by multiplication with a
positive number.)


Corollary 3.15 (F. Riesz). A normed vector space is locally compact if 
and only if it is finite dimensional. 


Proof. Let E be a locally compact normed vector space, and let B be
the closed ball of radius 1 centered at 0. We can find a finite number of
points $x_1, \ldots, x_n \in B$ such that B is covered by the open balls of
radius 1/2 centered at these points. We contend that $x_1, \ldots, x_n$
generate E. Let F be the subspace generated by $x_1, \ldots, x_n$. Then F is
finite dimensional, hence closed in E as a trivial consequence of
Corollary 3.14. Suppose that $x \in E$ and $x \notin F$. Let

    $d(x, F) = \inf_{y \in F} |x - y|$.

Drawing a closed ball around x intersecting F, and using the fact that
the intersection of F and this ball is compact, we conclude that there is
some $z \in F$ such that $d(x, F) = |x - z|$, and we have $x - z \neq 0$
since $x \notin F$. Then there is some $x_i$ such that

    $\left| \frac{x - z}{|x - z|} - x_i \right| < \frac{1}{2}$,

and consequently that

    $\bigl| x - z - |x - z| x_i \bigr| < \frac{|x - z|}{2}$.

However, $z + |x - z| x_i$ lies in F, and by definition of z such that

    $d(x, F) = |x - z|$

we conclude that the left-hand side is $\geq |x - z|$. This is a
contradiction which proves our corollary.


Let X be a locally compact Hausdorff space. One can construct a 
compact space by adjoining to X a point “at infinity” as follows. Let p 
be some point not in X and let X’ be the union of X and {p}. We 
define a base of open sets in X’ by throwing into this base all subsets of 
X which are open in X, and the complements in X’ of compact sets in 
X. That this defines a base is clear, and one also verifies at once that X’ 
is then compact. It is called the one point compactification of X. 

It is easy to see that the one point compactification of $\mathbf{R}$ is
homeomorphic to a circle. The one point compactification of the plane
$\mathbf{R}^2$ is homeomorphic to the sphere. In general, the one point
compactification of $\mathbf{R}^n$ is homeomorphic to the n-sphere (i.e. the
set of all $x \in \mathbf{R}^{n+1}$ such that $|x| = 1$, where $| \ |$ is the
euclidean norm).
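
For the case of the line, one standard way to see the homeomorphism is
inverse stereographic projection. The text does not specify a map, so this
particular formula and the name `to_circle` are assumptions of the sketch.
The map sends $t \in \mathbf{R}$ to the point
$(2t/(t^2+1), (t^2-1)/(t^2+1))$ of the unit circle, and the point at
infinity corresponds to the pole $(0, 1)$, which is approached as
$|t| \to \infty$.

```python
import math

def to_circle(t):
    """Inverse stereographic projection: the real line onto the unit
    circle with the pole (0, 1) removed."""
    d = t * t + 1.0
    return (2.0 * t / d, (t * t - 1.0) / d)

for t in (-3.0, 0.0, 0.5, 10.0):
    x, y = to_circle(t)
    print(t, (x, y), x * x + y * y)   # the last value is 1 up to rounding

# Points far out on the line land close to the pole (0, 1).
print(math.dist(to_circle(1e6), (0.0, 1.0)))
```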


II, §4. SEPARATION BY CONTINUOUS FUNCTIONS


We are concerned throughout this section with a normal space X and 
the manner by which one can separate two disjoint closed sets by means 
of a continuous function. 


Lemma 4.1. Let X be a normal space. If A is closed in X and $A \subset U$
is contained in an open set U, then there exists an open set $U_1$ such that

    $A \subset U_1 \subset \bar{U}_1 \subset U$.


Proof. Let B be the complement of U. By the definition of normality,
there exist disjoint open sets $U_1$, $V_1$ such that $A \subset U_1$ and
$B \subset V_1$. The closure $\bar{U}_1$ is contained in the closed set
$\mathscr{C}V_1$, which is contained in U, so $U_1$ satisfies our
requirements.


Theorem 4.2 (Urysohn’s Lemma). Let X be a normal space and let A, 
B be disjoint closed subsets. Then there exists a continuous function f 
on X with values in the interval [0,1] such that f(A) = 0 and f(B) = 1. 
Proof. In a metric space, which is the most important in practice, one
can give a trivial proof. Cf. Exercise 7. We now give the proof in
general. Let $U_1$ be the complement of B so that $A \subset U_1$. We find
$U_{1/2}$ such that

    $A \subset U_{1/2} \subset \bar{U}_{1/2} \subset U_1$.

We then find $U_{1/4}$ and $U_{3/4}$ such that

    $A \subset U_{1/4} \subset \bar{U}_{1/4} \subset U_{1/2} \subset \bar{U}_{1/2} \subset U_{3/4} \subset \bar{U}_{3/4} \subset U_1$.

Inductively, for each integer k with $0 < k \leq 2^n$, we find $U_{k/2^n}$
such that if $r < s$, then $U_r \subset \bar{U}_r \subset U_s$. We then define
the function f by

    $f(x) = 1$  if $x \in B$,
    $f(x) = $ inf of all r such that $x \in U_r$  if $x \notin B$.

It is then essentially clear that f is continuous. We carry out the details.
It will suffice to prove that for numbers a, b such that $0 < a \leq 1$ and
$0 \leq b < 1$ the inverse images of the half-open intervals

    $f^{-1}[0, a)$  and  $f^{-1}(b, 1]$

are open. In fact, we have

    $f^{-1}[0, a) = \bigcup_{r < a} U_r$

because $f(x) < a$ if and only if x lies in some $U_r$ with $r < a$.
Similarly, we have $f(x) > b$ if and only if $x \notin \bar{U}_r$ for some
$r > b$, so that

    $f^{-1}(b, 1] = \bigcup_{r > b} \mathscr{C}\bar{U}_r$.

This proves our theorem.
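
In the metric case mentioned at the beginning of the proof, the separating
function can be written down explicitly as in Exercise 7(b), namely
$x \mapsto d(x, A)/(d(x, A) + d(x, B))$. The sketch below evaluates it for
two finite (hence closed) disjoint subsets of the plane. The sample sets
and the names `dist_to_set` and `urysohn` are hypothetical and serve only
as an illustration.

```python
import math

def dist_to_set(x, S):
    """d(x, S) = inf of d(x, s) over s in S, for a finite non-empty set S."""
    return min(math.dist(x, s) for s in S)

def urysohn(A, B):
    """Return f with values in [0, 1], f = 0 on A and f = 1 on B.
    A and B must be disjoint, so the denominator never vanishes."""
    def f(x):
        dA, dB = dist_to_set(x, A), dist_to_set(x, B)
        return dA / (dA + dB)
    return f

# Hypothetical disjoint closed sets in the plane.
A = [(0.0, 0.0), (1.0, 0.0)]
B = [(0.0, 2.0), (3.0, 1.0)]
f = urysohn(A, B)
print([f(a) for a in A], [f(b) for b in B], f((0.5, 0.7)))
```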


Since a compact Hausdorff space is normal, Urysohn's lemma applies
in this case. One needs it frequently in the locally compact case in the 
following form. 


Corollary 4.3. Let X be a locally compact Hausdorff space, and K a 
compact subset. There exists a continuous function g on X which is 1 
on K and which is equal to 0 outside a compact set. 
Proof. Each $x \in K$ has an open neighborhood $V_x$ with compact closure.
A finite number of such neighborhoods $V_{x_1}, \ldots, V_{x_n}$ covers K. Let

    $V = V_{x_1} \cup \cdots \cup V_{x_n}$.

Then the closure $\bar{V}$ of V is compact. There exists a continuous
function $g \geq 0$ on $\bar{V}$ (compact Hausdorff, hence normal) which is 1
on K and 0 outside V, i.e. 0 on $\bar{V} \cap \mathscr{C}V$. We define g to be
0 on the complement of $\bar{V}$ in X. Then g is continuous on the closed set
$\bar{V}$ and on the closed set $\mathscr{C}V$ (where it is 0), and these two
closed sets cover X, so g is continuous on X. This proves our corollary.


Theorem 4.4 (Tietze Extension Theorem). Let A be a closed subset of a 
normal space X and let f be a continuous (real valued) function on A. 
Then there exists a continuous function f* on X whose restriction to A 
is equal to f. If f has values in [0,1], then we can choose f* to have 
values in [0, 1] also. 


Proof. Assume first that f has values in [0, 1]. If A, B are disjoint
closed subsets of X, we denote by $g_{A,B}$ a function with values in [0, 1]
such that $g(A) = 0$ and $g(B) = 1$. Such a function exists by Theorem 4.2.

We shall now define functions $f_n$ on A and $g_n$ on X.
We let $f_0 = f$ and define sets $A_0$, $B_0$ by the conditions:

    $A_0 = \{x \in A$ such that $f(x) \leq 1/3\}$,
    $B_0 = \{x \in A$ such that $f(x) \geq 2/3\}$.

We let $g_0 = \frac{1}{3} g_{A_0, B_0}$ and define $f_1 = f_0 - g_0$.
Inductively, suppose that we have defined $f_n$; we let

    $A_n = \{x \in A$ such that $f_n(x) \leq (1/3)(2/3)^n\}$,
    $B_n = \{x \in A$ such that $f_n(x) \geq (2/3)(2/3)^n\}$.

We then define

    $g_n = (1/3)(2/3)^n g_{A_n, B_n}$,

and let $f_{n+1} = f_n - g_n$. (Here of course, we understand by $g_n$ its
restriction to A.) Then in particular:

    $f_{n+1} = f - (g_0 + \cdots + g_n)$.

We have

    (*)  $0 \leq g_n \leq (1/3)(2/3)^n$  and  $0 \leq f_n \leq (2/3)^n$.

The first inequality is clear. The second is proved by induction. It is
clear for $n = 0$. Assume it for n. One distinguishes the three cases in
which for a given $x \in A$ we have $x \in A_n$, or $x \notin A_n$ and
$x \notin B_n$, or $x \in B_n$. The desired inequality for $f_{n+1}$ is then
obvious in each case, using the inductive hypothesis.

From our inequalities (*), we then conclude that the series

    $g_0 + g_1 + \cdots + g_n + \cdots$

converges pointwise, and furthermore converges to f on A. The uniform
bounds imply at once that the limit function is continuous, thus proving
Theorem 4.4, when f has values in [0, 1].
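
The successive approximation used in this proof can be imitated
numerically. What follows is only a sketch under assumptions that are not
in the text: X is taken to be the real line, A is a finite set of reals,
and the functions $g_{A_n, B_n}$ are taken to be the distance quotients of
Exercise 7(b), with the convention that the function is 0 when $B_n$ is
empty and 1 when $A_n$ is empty. The names `g_sep` and `tietze` and the
sample data are hypothetical.

```python
def dist(x, S):
    """Distance from the real number x to the finite non-empty set S."""
    return min(abs(x - s) for s in S)

def g_sep(An, Bn):
    """A function with values in [0, 1], equal to 0 on An and 1 on Bn."""
    def g(x):
        if not Bn:
            return 0.0
        if not An:
            return 1.0
        dA, dB = dist(x, An), dist(x, Bn)
        return dA / (dA + dB)
    return g

def tietze(A, values, steps=60):
    """Return the partial sum g_0 + ... + g_{steps-1} of the proof,
    as a function defined on the whole line."""
    fn = dict(zip(A, values))          # the current f_n, defined on A
    terms = []
    for n in range(steps):
        c = (2.0 / 3.0) ** n
        An = [a for a in A if fn[a] <= c / 3.0]
        Bn = [a for a in A if fn[a] >= 2.0 * c / 3.0]
        g = g_sep(An, Bn)
        terms.append((c / 3.0, g))     # g_n = (1/3)(2/3)^n g_{A_n,B_n}
        fn = {a: fn[a] - (c / 3.0) * g(a) for a in A}   # f_{n+1} = f_n - g_n
    return lambda x: sum(w * g(x) for w, g in terms)

# Hypothetical data: a function on a four-point closed subset of the line.
A = [0.0, 1.0, 2.5, 4.0]
values = [0.2, 0.9, 0.5, 0.0]
extension = tietze(A, values)
print([extension(a) for a in A], extension(1.7))
```

After n steps the error on A is $f_n$, which is at most $(2/3)^n$ by (*),
so sixty steps already reproduce the given values to about ten decimal
places.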


Remark 1. The restriction to the interval [0,1] is of course unneces- 
sary, and the theorem extends at once to any other closed bounded 
interval, for instance by mapping such an interval linearly on [0, 1]. 


Now suppose that f is unbounded. Using the arctangent map we 
reduce the theorem to the case when f takes values in the open interval 
(—1, 1) and we must then know that the extension can be so chosen that 
its values also lie in the open interval (—1,1). Let B be the closed 
set where the extension f* (which we have constructed with values in 
[—1,1]) takes on the values 1 or —1. Then A and B are disjoint, so 
that by Urysohn’s lemma there exists a continuous function h on X with 
values in [0,1] such that h is 1 on A and 0 on B. Then hf* has values
in the open interval (—1,1), as desired. This concludes the proof of 
Theorem 4.4. 


Remark 2. The theorem also holds in the complex case, dealing separately
with the real and imaginary parts. The extra condition on the
restriction of the values can then be formulated analogously by requiring
that

    $\|f^*\| \leq \|f\|$.

Indeed, suppose that we have extended f to a bounded continuous complex
valued function g. Let $b = \|f\|$. Let h be the function such that
$h(z) = z$ if $|z| \leq b$, and $h(z) = bz/|z|$ if $|z| > b$. Then h is
continuous, $|h| \leq b$, and $h \circ g$ fulfills our requirement.
1. (a) Let X, Y be compact metric spaces. Prove that a mapping $f: X \to Y$
is continuous if and only if its graph is closed in $X \times Y$.

(b) Let Y be a complete metric space, and let X be a metric space. Let A be
a subset of X. Let $f: A \to Y$ be a mapping that is uniformly continuous.
Let $\bar{A}$ be the closure of A in X. Show that there exists a unique
extension of f to a continuous map $\bar{f}: \bar{A} \to Y$, and that
$\bar{f}$ is uniformly continuous. You may assume that X, Y are subsets of a
Banach space if you wish, in order to write the distance function in terms
of the absolute value sign.

2. Seminorms. Let E be a vector space. A function $\sigma: E \to \mathbf{R}$
is called a seminorm if it satisfies the same conditions as a norm except
that we allow $\sigma(x) = 0$ without necessarily having $x = 0$. In other
words, $\sigma$ satisfies the following conditions:


SN 1. We have $\sigma(x) \geq 0$ for all $x \in E$.
SN 2. If $x \in E$ and a is a number, then $\sigma(ax) = |a| \sigma(x)$.
SN 3. We have $\sigma(x + y) \leq \sigma(x) + \sigma(y)$ for all $x, y \in E$.


We also denote a seminorm by the symbols $| \ |$.

(a) If $| \ |$ is a seminorm on E, show that the set $E_0$ of elements
$x \in E$ with $|x| = 0$ is a subspace.

(b) Define open balls with respect to a seminorm as with a norm. Show that
the topology whose base is the family of open balls is Hausdorff if and
only if the seminorm is a norm.

(c) Let $\{\sigma_n\}$ be a sequence of seminorms on E such that the values
$\sigma_n(x)$ are bounded. Let $\{a_n\}$ be a sequence of positive numbers
such that $\sum a_n$ converges. Show that $\sum a_n \sigma_n$ is a seminorm.

(d) Let $\{\sigma_i\}_{i \in I}$ be a family of seminorms on a vector space E.
Let $x_0 \in E$ and let $i_1, \ldots, i_n$ be a finite number of indices. Let
$r > 0$. We call the set of all $x \in E$ such that

    $\sigma_{i_k}(x - x_0) < r$,  $k = 1, \ldots, n$,

a basic open set. Show that the family of basic open sets is a base
for a topology on E, which is said to be determined by the family of
seminorms.


3. (a) Let $l^1$ be the set of all sequences $\alpha = \{a_n\}$ of numbers
(say, real) such that $\sum |a_n|$ converges. Define

    $|\alpha| = \sum_{n=1}^{\infty} |a_n|$.

Show that this is a norm on $l^1$, and that $l^1$ is complete under this
norm.

(b) Let $\beta = \{b_n\}$ be a fixed sequence in $l^1$. Show that the set of
all $\alpha \in l^1$ such that $|a_n| \leq |b_n|$ is compact. Show that the
unit sphere in $l^1$ is not compact.

4. Let $\alpha$ be a real number, $0 < \alpha \leq 1$. A real valued function
f on [0, 1] is said to satisfy a Hölder condition of order $\alpha$ if there
is a constant C such that for all x, y we have

    $|f(x) - f(y)| \leq C |x - y|^\alpha$.

For such a function, define

    $\|f\|_\alpha = \sup_x |f(x)| + \sup_{x \neq y} \frac{|f(x) - f(y)|}{|x - y|^\alpha}$.

(a) Show that the set of functions satisfying such a Hölder condition is a
vector space, and that $\| \ \|_\alpha$ is a norm on this space.
(b) Show that the set of functions f with $\|f\|_\alpha \leq 1$ is a compact
subset of $C([0, 1])$.


5. Metric spaces. (a) Let X be a metric space with distance function d. Define 
d'(x, y) = min{1, d(x, y)}. Show that d’ is a distance function, and that the 
notion of convergence and limit with respect to d’ is the same as with 
respect to d. 

(b) As in normed vector spaces, one can define Cauchy sequences, i.e.
sequences $\{x_n\}$ such that given $\epsilon$, there exists N such that for
all $m, n \geq N$ we have $d(x_n, x_m) < \epsilon$. A metric space is called
complete if every Cauchy sequence converges. Show that if a metric space X
as in part (a) is complete with respect to d, then it is complete with
respect to d'.

(c) For each $x \in X$ define the function $f_x$ on X by

    $f_x(y) = d(x, y)$.

Let $\| \ \|$ be the sup norm. Show that

    $d(x, y) = \|f_x - f_y\|$.

Let a be a fixed element of X and let $g_x = f_x - f_a$. Show that the map
$x \mapsto g_x$ is a distance-preserving embedding of X into the normed
vector space of bounded functions on X. (If the metric is bounded, you can
use $f_x$ instead of $g_x$.) Thus one need not fuss too much with abstract
metric spaces. Besides, almost all metric spaces which occur naturally are
in fact given as subsets of normed vector spaces.

A topological space is said to be metrizable if there exists a metric 
such that the open balls form a basis for the topology. Such a metric is 
said to be compatible with the topology. 


6. Let A be a subset of a metric space X. For each $x \in X$, let

    $d(x, A) = \inf d(x, y)$

for all $y \in A$. Show that the map

    $x \mapsto d(x, A)$

is a continuous function on X, and that $d(x, A) = 0$ if and only if x lies
in the closure of A. We call $d(x, A)$ the distance from x to A.


7. (a) Show that a metrizable space is normal. [Hint: Let A, B be disjoint 
closed subsets. Let U be the set of x such that d(x, A) < d(x, B) and let V 
be the set of x such that d(x, B) < d(x, A).] 
(b) If A, B are disjoint closed subsets of a metric space, show that the 
function 


    $x \mapsto d(x, A)/(d(x, A) + d(x, B))$


can be used to prove Urysohn’s lemma. 
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8. 
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Let X be a topological space and E a normed vector space. Let M(X, E) be 

the set of all maps of X into E and C(X, E) the space of all continuous maps 

of X into E. Let B(X, E) be the space of all bounded maps, and BC(X, E) 

the space of bounded continuous maps. 

(a) Show that BC(X, E) is closed in B(X, E). 

(b) Suppose that E is complete, i.e. a Banach space. Show that B(X, E) is 
complete, with the sup norm. 

(c) If X is compact, show that C(X, E) = BC(X, E). 


. Uniform convergence on compact sets. Let X be a Hausdorff space. Let 


M(X, E) be the space of maps of X into a Banach space E. A sequence { f,} 
in this space is said to be uniformly Cauchy on compact subsets if given a 
compact set K and ¢ > 0, there exists N such that for m, n 2 N, we have 


fn — Smilax < € 


where || ||, is the sup norm on K. In other words, the sequence restricted to 
K is uniformly Cauchy. The sequence is said to be uniformly convergent on 
compact sets if there is some map f having the following property. Given a 
compact set K and ¢, there exists N such that for n 2 N, we have 


IIIn — Fix < & 


In other words, the sequence restricted to K is uniformly convergent. We 
shall now make M(X, E) into a metric space for which the above convergence 
is the same as convergence with respect to this metric, in certain cases. 

A sequence $\{K_i\}$ of compact subsets of X is said to be exhaustive if their
union is equal to X, and if every compact subset of X is contained in some
$K_i$. We assume that there exists such a sequence $\{K_i\}$.

(a) Define

$$d(f) = \sum_{i=1}^{\infty} 2^{-i} \min(1, \|f\|_{K_i}).$$

If f is unbounded on K, then we set $\|f\|_K = \infty$ and $\min(1, \|f\|_K) = 1$.
Show that d(f) satisfies two of the properties of a norm, namely:

$$d(f) = 0 \text{ if and only if } f = 0;$$
$$d(f + g) \leq d(f) + d(g).$$

(b) Define d(f, g) by d(f - g). Show that d(f, g) is a metric on M(X, E).
(c) Show that

$$2^{-i} \inf(1, \|f\|_{K_i}) \leq d(f) \quad\text{and}\quad d(f) \leq \|f\|_{K_i} + 2^{-i}.$$

(d) Show that a sequence $\{f_n\}$ converges uniformly on compact sets if and
only if it converges in the above metric.

(e) Let K be a compact set and $\varepsilon > 0$. Given f, let $V(f, K, \varepsilon)$ be the set of
all maps g such that $\|f - g\|_K < \varepsilon$. Show that $V(f, K, \varepsilon)$ is open in the
topology defined by the metric. Show that the family of all such open
sets for all choices of f, K, $\varepsilon$ is a base for the topology. This proves that
the topology does not depend on the choice of exhaustive sequence $\{K_i\}$.
(f) If E is complete, i.e. a Banach space, show that M(X, E) is complete in
the metric defined above.
(g) If X is locally compact, show that the space of continuous maps C(X, E)
is closed in M(X, E) for the metric.


10. Let U be the open unit disc in the plane. Show that there is an exhaustive
sequence of compact subsets of U.


11. Let U be a connected open set in the plane (or in Euclidean space $\mathbf{R}^k$). Show
that there is an exhaustive sequence of compact subsets of U.


12. Let U be an open subset of a normed vector space. Show that U is
connected if and only if U is arcwise connected.


13. The diagonal $\Delta$ in a product $X \times X$ is the set of all points (x, x).

(a) Show that a space X is Hausdorff if and only if the diagonal is closed in
$X \times X$.

(b) Show that a product of Hausdorff spaces is Hausdorff.


14. If A is a subspace of a space X, we define the boundary of A (denoted by $\partial A$)
to be the set of all x such that any open neighborhood U of x contains a
point of A and a point not in A. In other words, $\partial A = \overline{A} \cap \overline{\mathscr{C}A}$,
where $\mathscr{C}A$ denotes the complement of A.

(a) Show that $\partial(A \cup B) \subset \partial A \cup \partial B$.

(b) Show that $\partial(A \cap B) \subset \partial A \cup \partial B$.

(c) Let X, Y be topological spaces, and let A be a subset of X, B a subset of
Y. Show that

$$\partial(A \times B) = (\partial A \times \overline{B}) \cup (\overline{A} \times \partial B).$$

(d) Let A be a subset of a complete normed vector space E. Let $x \in A$ and
let y be in the complement of A. Show that there exists a point on the
line segment between x and y which lies on the boundary of A. (The line
segment consists of all points $x + t(y - x)$ with $0 \leq t \leq 1$.)


Separable Spaces 


15. A topological space having a countable base for its open sets is called
separable. Show that a separable space has a countable dense subset.


16. (a) If X is a metric space and has a countable dense subset, then X is
separable.
(b) A compact metric space is separable.


17. (a) Every open covering of a separable space has a countable subcovering.
(b) A disjoint collection of open sets in a separable space is countable.
(c) A base for the open sets of a separable space contains a countable base.


18. A denumerable product of separable (resp. metric) spaces is separable (resp.
metric).


19. Let X be separable. Show that the following conditions are equivalent:

(a) X is compact.

(b) Every sequence $\{x_n\}$ in X has at least one point of accumulation, that is
X is sequentially compact.

(c) Every decreasing sequence $\{A_n\}$ of non-empty closed sets has a non-empty
intersection.


20. Prove that a normal separable space X is metrizable (Urysohn metrization
theorem). [Hint: Let $\{U_n\}$ be a countable base for the topology. Let $(U_{n_i}, U_{m_i})$
be an enumeration of all pairs of elements in this base such that $\overline{U}_{n_i} \subset U_{m_i}$.
For each i let $f_i$ be a continuous function satisfying $0 \leq f_i \leq 1$ and such that
$f_i$ is 0 on $\overline{U}_{n_i}$ and 1 on the complement of $U_{m_i}$. Let

$$d(x, y) = \sum_{i=1}^{\infty} \frac{1}{2^i} |f_i(x) - f_i(y)|.$$

Show that d is a metric and that the identity mapping is continuous with
respect to the given topology on X and the topology obtained from the
metric. You will use the fact that given $x \in X$ and some open set $U_m$ in the
base containing x, there exists another set $U_n$ in the base such that

$$x \in U_n \subset \overline{U}_n \subset U_m.]$$


21. Regular spaces. A topological space X is called regular if it is Hausdorff, and
if given a point x and a closed set A not containing x, there exist disjoint
open sets U, V such that $x \in U$ and $A \subset V$.
(a) A subspace of a regular space is regular.
(b) Let X be a topological space. If every point has a closed neighborhood
which is regular, then X is regular.
(c) Every locally compact Hausdorff space is regular.
(d) If X is separable regular, show that every point x has a sequence of open
neighborhoods $\{U_n\}$ such that:
(i) $\overline{U}_{n+1} \subset U_n$;
(ii) $\{x\} = \bigcap U_n$.


The following exercises are of somewhat less general interest than the
preceding ones (but some are more amusing).


22. Proper maps. Let X, Y be topological spaces and $f: X \to Y$ a map. We say
that f is closed if f maps closed sets into closed sets. We say that f is proper
if f is continuous and if for every topological space Z the map

$$f \times 1_Z = f_Z : X \times Z \to Y \times Z$$

given by $f_Z(x, z) = (f(x), z)$ is closed.

(a) Show that a proper map is closed.

(b) For each i = 1, ..., n let $f_i: X_i \to Y_i$ be a continuous map. Assume that $X_i$
is not empty for each i. Let $f: \prod X_i \to \prod Y_i$ be the product map. Show
that f is proper if and only if all $f_i$ are proper.

(c) If $f: X \to Y$ is proper and A is closed in X, show that f|A is proper.


23. Let $f: X \to X'$ and $g: X' \to X''$ be continuous maps. Prove:
(a) If f and g are proper, so is $g \circ f$.
(b) If $g \circ f$ is proper and f is surjective, then g is proper.
(c) If $g \circ f$ is proper and g is injective, then f is proper.
(d) If $g \circ f$ is proper and $X'$ is Hausdorff, then f is proper.


24. Let X be a topological space, {p} a set consisting of one element p. The map
$f: X \to \{p\}$ is proper if and only if X is compact. [Hint: Assume that f is
proper. To show that X is compact, let $\{S_\alpha\}$ be a family of non-empty closed
sets having the finite intersection property. Let $Y = X \cup \{p\}$, where p is
disjoint from X. Define a base for a topology of Y by letting a set be in this
base if it is of type $S_\alpha \cup \{p\}$, or if it is an arbitrary subset of X. Show this is
a base. The projection $\pi: X \times Y \to Y$ is a closed map. Let D be the subset of
$X \times Y$ consisting of all pairs (x, x) with $x \in X$. Then $\pi(\overline{D})$ is closed and
therefore contains p. Hence there exists $x \in X$ such that $(x, p) \in \overline{D}$, whence
given an open U in X containing x, and any $S_\alpha$, the set $U \times (S_\alpha \cup \{p\})$
intersects D, whence U intersects $S_\alpha$, and x lies in $\bigcap S_\alpha$.]


25. Let $f: X \to Y$ be a continuous map. Show that the following properties are
equivalent:

(a) f is proper.

(b) f is closed and for each $y \in Y$ the set $f^{-1}(y)$ is compact.


26. Let $f: X \to Y$ be proper. If B is a compact subset of Y, then $f^{-1}(B)$ is
compact.


27. (The marriage problem so baptized by Hermann Weyl.) Let B be a set of boys,
and assume that each boy b knows a finite set of girls $G_b$. The problem is to
marry each boy to a girl of his acquaintance, injectively. A necessary
condition is that each set of n boys know collectively at least n girls. Prove that
this condition is sufficient. [Hint: First assume that B is finite, and use
induction. Let n > 1. If for all $1 \leq k < n$ each set of k boys knows > k girls,
marry off one boy and refer the others to the induction hypothesis. If for
some k with $1 \leq k < n$ there exists a subset of k boys knowing exactly k girls,
marry them by induction. The remaining n - k boys satisfy the induction
hypothesis with respect to the remaining girls (obvious!) and thus the case of
finite B is settled. For the infinite case, which is really the relevant problem
here, take the Cartesian product $\prod G_b$ over all $b \in B$, each $G_b$ being finite,
discrete, and use Tychonoff's theorem. For this elegant proof, cf. Halmos
and Vaughan, Amer. J. Math. January 1950, pp. 214-215.]


28. The Cantor set. Let K be the subset of [0, 1] consisting of all numbers
having a trecimal expansion

$$\sum_{n=1}^{\infty} \frac{a_n}{3^n}$$

where $a_n = 0$ or $a_n = 2$. This set is called the Cantor set. Show that K is
compact. Show that the complement of K consists of a denumerable union
of intervals, and that the sum of the lengths of these intervals is 1. Show that
the connected component of each point in K is the point itself. (One says
that K is totally disconnected.)
[It can be shown that a compact metric space is always a continuous
image of a Cantor set, and also that a totally disconnected compact metric
space is homeomorphic to a Cantor set. Cf. books on general topology.
The Cantor set has measure 0, is not countable, and is a rich source for
counterexamples.]


29. Peano curve. Let K be the Cantor set of the preceding exercise. Let S =
[0, 1] × [0, 1] be the unit square. Let $f: K \to S$ be the map which to each
element $\sum a_n/3^n$ of the Cantor set assigns the pair of numbers

$$\left(\sum \frac{b_{2n+1}}{2^n},\ \sum \frac{b_{2n}}{2^n}\right),$$

where $b_m = a_m/2$. Show that f is well defined. Show that f is surjective and
continuous. One can then extend f to a continuous map of the interval onto
the square. This is called a Peano curve. Note that the interval has
dimension 1 whereas its image under the continuous map f has dimension 2. This
caused quite a sensation at the end of the nineteenth century when it was
discovered by Peano.


30. The semi parallelogram law (Bruhat-Tits). Let X be a complete metric space.
We say that X satisfies the semi parallelogram law, or is seminegative, if
given two points $x_1, x_2 \in X$ there is a point z such that for all $x \in X$ we have

$$d(x_1, x_2)^2 + 4d(x, z)^2 \leq 2d(x, x_1)^2 + 2d(x, x_2)^2.$$

Prove that under this law, $d(z, x_1) = d(z, x_2) = d(x_1, x_2)/2$, and z is uniquely
determined. We call z the midpoint of $x_1$, $x_2$.


31. (Serre, after Bruhat-Tits) Let X be a seminegative complete metric space. Let
S be a bounded subset of X. Show that there exists a unique closed ball
$B_r(x_1)$ of minimal radius containing S. [Use the semi parallelogram law both
for uniqueness and existence. For existence, show that if $\{B_{r_n}(x_n)\}$ is a
sequence of closed balls containing S with $\lim r_n = r$ (the inf of all radii of
closed balls containing S), then $\{x_n\}$ is Cauchy.] The center of that closed
ball is called the circumcenter of S.


32. (Bruhat-Tits fixed point theorem) Let X be a complete seminegative metric
space. Let G be a group of isometries of X, i.e. bijective maps $f: X \to X$
which preserve distance. Denote the action of G by $(g, x) \mapsto g.x$. Suppose G
has a bounded orbit (i.e. there is a point x such that the set S of all elements
g.x, $g \in G$, is bounded). Then G has a fixed point, namely the circumcenter
of the orbit.

For the above exercises, cf. Bruhat-Tits, Groupes Réductifs sur un Corps 
Local I, Pub. IHES 41 (1972) pp. 5-251; and K. Brown, Buildings, Springer 
Verlag, 1989, Chapter VI, Theorem 2 of §5. 
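As a numerical companion to Exercise 28 (the Cantor set), the claim that the removed intervals have total length 1 can be checked on partial sums. This is only a sketch using exact rational arithmetic; the function names `removed_length` and `cantor_point` are illustrative, and the count "step n removes 2^(n-1) open middle thirds of length 3^(-n)" is the standard description of the construction rather than something spelled out in the exercise.

```python
from fractions import Fraction

def removed_length(stages):
    """Total length of the open middle thirds removed in the first `stages` steps."""
    # Step n removes 2**(n-1) intervals, each of length 3**(-n).
    return sum(Fraction(2 ** (n - 1), 3 ** n) for n in range(1, stages + 1))

def cantor_point(digits):
    """Finite sum of a_n / 3**n, where every trecimal digit a_n is 0 or 2."""
    assert all(a in (0, 2) for a in digits)
    return sum(Fraction(a, 3 ** n) for n, a in enumerate(digits, start=1))

# Partial sums equal 1 - (2/3)**N, so they increase to 1 as N grows.
partial_sums = [removed_length(N) for N in (1, 2, 5, 10)]
```

The partial sums are exactly 1 - (2/3)^N, which tends to 1; points built from digits 0 and 2 never fall in the first removed interval (1/3, 2/3).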


CHAPTER III 


Continuous Functions 
on Compact Sets 


III, §1. THE STONE-WEIERSTRASS THEOREM


Let E be a normed vector space (over the real or the complex numbers). 
We can define the notion of Cauchy sequence in E as we did for real 
sequences, and also the notion of convergent sequence (having a limit). If 
every Cauchy sequence converges, then E is said to be complete, and is 
also called a Banach space. A closed subspace of a Banach space is 
complete, hence it is also a Banach space. 


Examples. Let S be a non-empty set, and let F be a normed vector 
space. We denote by B(S, F) the space of bounded maps from S into F. 
It is a normed vector space under the sup norm, and if F is a Banach 
space, then B(S, F) is complete, and thus is also a Banach space. The 
proof that B(S, F) is complete if F is complete should be carried out as 
an exercise. (The reader should have had a similar proof as part of a 
course in advanced calculus but, at any rate, has had it for functions 
which are real valued. The proof applies as well to Banach spaces.) If S 
is a subset of a normed vector space (or a metric space) we denote by 
C(S, F) the space of continuous maps of S into F, and by BC(S, F) 
the subspace of bounded continuous maps. Then BC(S, F) is closed in 
B(S, F), this being nothing else but a special case of the assertion that 
a uniform limit of continuous maps is continuous. Again, the reader 
should have seen a proof in the case of functions, and that same proof (a
$3\varepsilon$-proof) applies to the case of maps into Banach spaces. (Do Exercise 0
if you have never done it before, or look up Undergraduate Analysis.)

Let X be a set. By an algebra A of functions on X (say, real valued)
we mean a subset of the ring of all functions having the properties that if
$f, g \in A$, then f + g and fg are in A, and if $c \in \mathbf{R}$, then $cf \in A$. Most of
the algebras we deal with also contain the constant functions (identified
with R itself). We make a similar definition of an algebra over C.

For example, the polynomials in one variable form an algebra, and so
do polynomials in several variables. If $\varphi$ is a function on some set S,
then the set of all functions which can be written in the form

$$a_0 + a_1\varphi + \cdots + a_n\varphi^n$$

with $a_i \in \mathbf{R}$ forms an algebra, said to be generated by $\varphi$. Similarly, we
have the notion of an algebra generated by a finite number of functions
$\varphi_1, \ldots, \varphi_r$, or by a family of functions. It is the algebra of polynomials
in $\varphi_1, \ldots, \varphi_r$. If X is a topological space, the set of all continuous
functions is an algebra, denoted by C(X). If we wish to specify the range
of values (real or complex), we write C(X, C) or C(X, R). Recall that a
function is a mapping with values in R or C.

Let S be a compact set. Let A be an algebra of continuous functions
on S. Every function in A is bounded because S is compact, and consequently
we have the sup norm on A, namely for $f \in A$,

$$\|f\| = \sup_{x \in S} |f(x)|.$$


Thus A is contained in the normed vector space of all bounded functions 
on S. We are interested in determining the closure of A. Since C(S) is 
closed, the closure of A will be contained in C(S). We shall find condi- 
tions under which it is equal to C(S). In other words, we shall find 
conditions under which every continuous function on S can be uniformly 
approximated by elements of A. 

We shall say that A separates points of S if given points $x, y \in S$,
and $x \neq y$, there exists a function $f \in A$ such that $f(x) \neq f(y)$. The ordinary
algebra of polynomial functions obviously separates points, since the
function f(x) = x already does so.


Theorem 1.1 (Stone—Weierstrass Theorem). Let S be a compact set, 
and let A be an algebra of real valued continuous functions on S. 
Assume that A separates points and contains the constant functions. 
Then the uniform closure of A is equal to the algebra of all real 
continuous functions on S. 


We shall first prove the theorem under an extra assumption. We shall 
get rid of the extra assumption afterwards. 


Lemma 1.2. In addition to the hypotheses of the theorem, assume also
that if $f, g \in A$ then $\max(f, g) \in A$, and $\min(f, g) \in A$. Then the
conclusion of the theorem holds.

Proof. We give the proof in three steps. First, we prove that given $x_1$,
$x_2 \in S$ and $x_1 \neq x_2$, and given real numbers $\alpha$, $\beta$, there exists $h \in A$ such
that $h(x_1) = \alpha$ and $h(x_2) = \beta$. By hypothesis, there exists $\varphi \in A$ such that
$\varphi(x_1) \neq \varphi(x_2)$. Let

$$h(x) = \alpha + (\beta - \alpha)\frac{\varphi(x) - \varphi(x_1)}{\varphi(x_2) - \varphi(x_1)}.$$

Then h satisfies our requirements.


Next we are given a continuous function f on S and also given $\varepsilon$. We
wish to find a function $g \in A$ such that

$$f(y) - \varepsilon < g(y) < f(y) + \varepsilon$$

for all $y \in S$. This will prove what we want. We shall satisfy these
inequalities one after the other. For each pair of points $x, y \in S$ there
exists a function $h_{x,y} \in A$ such that

$$h_{x,y}(x) = f(x) \quad\text{and}\quad h_{x,y}(y) = f(y).$$

If x = y, this is trivial. If $x \neq y$, this is what we proved in the first step.
We now fix x for the moment. For each $y \in S$ there exists an open ball
$U_y$ centered at y such that for all $z \in U_y$ we have

$$h_{x,y}(z) < f(z) + \varepsilon.$$

This is simply the continuity of $f - h_{x,y}$ at y. The open sets $U_y$ cover S,
and since S is compact, there exists a finite number of points $y_1, \ldots, y_n$
such that $U_{y_1}, \ldots, U_{y_n}$ already cover S. Let

$$h_x = \min(h_{x,y_1}, \ldots, h_{x,y_n}).$$

Then $h_x$ lies in A according to the additional hypothesis of the lemma
(and induction). Furthermore, we have for all $z \in S$:

$$h_x(z) < f(z) + \varepsilon,$$

and $h_x(x) = f(x)$, that is $(h_x - f)(x) = 0$.

Now for each $x \in S$ we find an open ball $V_x$ centered at x such that,
by continuity, for all $z \in V_x$ we have $(h_x - f)(z) > -\varepsilon$, or in other words,

$$f(z) - \varepsilon < h_x(z).$$

By compactness, we can find a finite number of points $x_1, \ldots, x_m$ such
that $V_{x_1}, \ldots, V_{x_m}$ cover S. Finally, let

$$g = \max(h_{x_1}, \ldots, h_{x_m}).$$

Then g lies in A, and we have for all $z \in S$

$$f(z) - \varepsilon < g(z) < f(z) + \varepsilon,$$

thereby proving the lemma.

The theorem is an easy consequence of the lemma, and will follow if
we can prove that whenever $f, g \in A$ then max(f, g) and min(f, g) lie in
the closure of A. To prove this, we note first that we can write

$$\max(f, g) = \frac{f + g}{2} + \frac{|f - g|}{2},$$
$$\min(f, g) = \frac{f + g}{2} - \frac{|f - g|}{2}.$$

Consequently it will suffice to prove that if $f \in A$ then |f| lies in the
closure of A. Since f is bounded, there exists a number c > 0 such that

$$-c \leq f(x) \leq c$$

for all $x \in S$. The absolute value function can be uniformly approximated
by ordinary polynomials on the interval [-c, c] by Exercises 6, 7, or 8,
which are very simple ad hoc proofs. Given $\varepsilon$, let P be a polynomial such
that

$$|P(t) - |t|| < \varepsilon$$

for $-c \leq t \leq c$. Then

$$|P(f(x)) - |f(x)|| < \varepsilon,$$

and hence |f| can be approximated by $P \circ f$. Explicitly, if

$$P(t) = a_n t^n + \cdots + a_0,$$

then

$$P \circ f = a_n f^n + \cdots + a_0,$$

i.e.

$$P(f(x)) = a_n f(x)^n + \cdots + a_0.$$


This concludes the proof of the Stone—Weierstrass theorem. 
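The last step of the proof, building max(f, g) from sums, products and a polynomial approximation of the absolute value, can be imitated numerically. The sketch below borrows the recursion of Exercise 7 of this chapter, P_0 = 0, P_{k+1}(s) = P_k(s) + (s - P_k(s)^2)/2, which approximates the square root on [0, 1]; the functions f, g, the bound c = 2 and the sampling grid are illustrative assumptions, not part of the text.

```python
def sqrt_poly(s, n):
    """Value at s of the polynomial P_n from P_0 = 0, P_{k+1} = P_k + (s - P_k**2)/2."""
    p = 0.0
    for _ in range(n):
        p = p + 0.5 * (s - p * p)
    return p

def abs_poly(t, c, n):
    """Polynomial in t approximating |t| on [-c, c], via |t| = c * sqrt((t/c)**2)."""
    return c * sqrt_poly((t / c) ** 2, n)

def approx_max(f, g, c, n):
    """x -> (f + g)/2 + P(f - g)/2, where P approximates |.| on [-c, c].

    Assumes |f(x) - g(x)| <= c on the set under consideration.
    """
    return lambda x: 0.5 * (f(x) + g(x)) + 0.5 * abs_poly(f(x) - g(x), c, n)

# Hypothetical example on S = [0, 1]: here f - g takes values in [-1, 1].
f = lambda x: x * x
g = lambda x: 1.0 - x
h = approx_max(f, g, c=2.0, n=200)
grid = [i / 1000 for i in range(1001)]   # sample points of S
err = max(abs(h(x) - max(f(x), g(x))) for x in grid)
```

With the bound of Exercise 7 (error of P_n at most 2/n), the sampled error `err` is at most c/n = 0.01 here. The function h is a polynomial expression in f and g, so it belongs to any algebra containing f, g and the constants.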

Corollary 1.3. Let S be a compact set in $\mathbf{R}^k$. Any real continuous
function on S can be uniformly approximated by polynomial functions in
k variables.

Proof. The set of polynomials contains the constants, and obviously
separates points of $\mathbf{R}^k$ since the coordinate functions $x_1, \ldots, x_k$ already do
this. So the theorem applies.
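For k = 1 and S = [0, 1], the corollary can be observed on a concrete function. The sketch below uses Bernstein polynomials, a classical explicit construction that is not the method of the proof above and is swapped in here only because it is easy to compute; the test function |x - 1/2| and the sampling grid are illustrative assumptions.

```python
from math import comb

def bernstein(f, n):
    """Return the n-th Bernstein polynomial of f on [0, 1] as a callable."""
    values = [f(k / n) for k in range(n + 1)]
    def B(x):
        # B_n f(x) = sum over k of f(k/n) * C(n, k) * x**k * (1 - x)**(n - k)
        return sum(values[k] * comb(n, k) * x ** k * (1 - x) ** (n - k)
                   for k in range(n + 1))
    return B

f = lambda x: abs(x - 0.5)               # continuous, not differentiable at 1/2
grid = [i / 200 for i in range(201)]     # sample points of [0, 1]

def sup_error(n):
    """Largest sampled deviation between f and its n-th Bernstein polynomial."""
    B = bernstein(f, n)
    return max(abs(B(x) - f(x)) for x in grid)
```

Comparing `sup_error(20)` with `sup_error(200)` shows the sampled error shrinking as the degree grows, in line with uniform approximation; the sampling only inspects finitely many points and is not a proof.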


There is a complex version of the Weierstrass-Stone theorem. Let A
be an algebra of complex valued functions on the set S. If $f \in A$, we
have its complex conjugate $\bar{f}$ defined by

$$\bar{f}(x) = \overline{f(x)}.$$

For instance, if $f(x) = e^{ix}$ then $\bar{f}(x) = e^{-ix}$. If A is an algebra over C of
complex valued functions, we say that A is self conjugate if whenever
$f \in A$ the conjugate function $\bar{f}$ is also in A.


Theorem 1.4 (Complex S-W Theorem). Let S be a compact set and 
A an algebra (over C) of complex valued continuous functions on S. 
Assume that A separates points, contains the constants, and is self con- 
jugate. Then the uniform closure of A is equal to the algebra of all 
complex valued continuous functions on S. 


Proof. Let $A_{\mathbf{R}}$ be the set of all functions in A which are real valued.
We contend that $A_{\mathbf{R}}$ is an algebra over R which satisfies the hypotheses
of the preceding theorem. It is obviously an algebra over R. If $x_1 \neq x_2$
are points of S, there exists $f \in A$ such that $f(x_1) = 0$ and $f(x_2) = 1$.
(The proof of the first step of Lemma 1.2 shows this.) Let $g = f + \bar{f}$.
Then $g(x_1) = 0$ and $g(x_2) = 2$, and g is real valued, so $A_{\mathbf{R}}$ separates
points. It obviously contains the real constants, and so the real S-W
theorem applies to it. Given a complex continuous function $\varphi$ on S, we
write $\varphi = u + iv$, where u, v are real valued. Then u, v are continuous,
and u, v can be approximated uniformly by elements of $A_{\mathbf{R}}$, say $f, g \in A_{\mathbf{R}}$
such that $\|u - f\| < \varepsilon$ and $\|v - g\| < \varepsilon$. Then f + ig approximates
$u + iv = \varphi$, thereby concluding the proof.


Remark. The Stone—Weierstrass theorem has a useful application to 
locally compact spaces. For such corollaries, we refer the reader to 
Chapter IX, §6, and Chapter XVI, §3. For explicit approximations in 
concrete cases, see the Exercises and also Chapter VIII, §1. 


III, §2. IDEALS OF CONTINUOUS FUNCTIONS


The second theorem of this chapter deals with ideals of continuous func- 
tions. Let S be a topological space, and R a ring of continuous functions 
(real valued) on S. An ideal J of R is a subset of R satisfying the
following properties: The zero function 0 is in J. If $f, g \in J$, then f + g
and $-f$ are in J, and if $h \in R$, then $hf \in J$. The reader should really have
met the definition of an ideal in an algebra course, but we don't assume
this here, although some motivation from algebra is useful.

Let f be continuous on S. A zero of f is a point $x \in S$ such that
f(x) = 0. The set of zeros of f is a closed set denoted by $Z_f$. Let J be
an ideal. Then the set

$$Z(J) = \bigcap_{f \in J} Z_f,$$

equal to the intersection of the sets of zeros of all $f \in J$, is closed, and is
called the set of zeros of J. If J, J' are two ideals, and $J \subset J'$, then
$Z(J) \supset Z(J')$. We ask to what extent the set of zeros of an ideal
determines this ideal, and answer this question in an important case.


Theorem 2.1. Let X be a compact space, and let R be the ring of 
continuous functions on X, with the sup norm. Let J be a closed ideal 
(i.e. an ideal, closed under the sup norm). If $f \in R$ is such that f(x) = 0
for all zeros x of J (i.e. if f vanishes on the set of zeros of J), then f 
lies in J. 


Proof. Given $\varepsilon$, let U be the subset of X consisting of all $x \in X$ such
that $|f(x)| < \varepsilon$. Then U is open, and the complement S of U is closed,
and hence compact. Note that U contains Z(J). For each $y \in S$, we can
find a function $g_y$ in J such that $g_y(x) \neq 0$ in some open neighborhood
$V_y$ of y (by continuity). There is some finite covering $\{V_{y_1}, \ldots, V_{y_m}\}$ of S
corresponding to functions $g_{y_1}, \ldots, g_{y_m}$. Let

$$g = g_{y_1}^2 + \cdots + g_{y_m}^2.$$

Then g is in J, is continuous, is nowhere 0 on S, and $\geq 0$. Since g has a
minimum on S, there is a number a > 0 such that $g(x) \geq a$ for all $x \in S$.
The function

$$\frac{ng}{1 + ng}$$

lies in J, because 1 + ng is nowhere 0 on X, its inverse is continuous
on X, so in R, and hence $(1 + ng)^{-1} ng \in J$. For n large, the function
ng/(1 + ng) tends uniformly to 1 on S, and hence the function

$$f\,\frac{ng}{1 + ng}$$

lies in J, and approximates f within $\varepsilon$ on S. Since $0 \leq ng/(1 + ng) \leq 1$ it
follows that on U we have the estimate

$$0 \leq |f\,ng/(1 + ng)| < \varepsilon,$$

and so fng/(1 + ng) lies within $2\varepsilon$ of f. Thus we have shown that f lies
in the closure of J. Since J is assumed closed, we conclude that f lies in
J, thereby proving our theorem.
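The uniform convergence of ng/(1 + ng) to 1 on S, used in the proof above, can be made explicit. The following is a short worked estimate in the notation of the proof, where a is the lower bound of g on S and $\|f\|$ is the sup norm of f on X; the explicit threshold for n is our own bookkeeping and is not stated in the text.

```latex
% On S we have g(x) >= a > 0, hence for every n >= 1:
0 \le 1 - \frac{ng}{1+ng} = \frac{1}{1+ng} \le \frac{1}{1+na}
\quad \text{on } S.
% Multiplying by |f| gives the approximation of f on S:
\Bigl| f - f\,\frac{ng}{1+ng} \Bigr| \le \frac{\|f\|}{1+na} < \varepsilon
\quad \text{on } S, \text{ as soon as } n > \frac{\|f\|}{a\,\varepsilon}.
```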


Remark 1. Situations analogous to that of Theorem 2.1 arise fre- 
quently in mathematics. For instance, let R be the ring of polynomials in 
n variables over the complex numbers, $R = \mathbf{C}[t_1, \ldots, t_n]$. Let J be an
ideal of R, and define zeros of J to be n-tuples of complex numbers
x such that f(x) = 0 for all $f \in J$. It is shown in algebraic geometry
courses that if f is a polynomial in R which vanishes on Z(J), then
$f^m \in J$ for some positive integer m. This is called Hilbert's Nullstellensatz.


Remark 2. Theorem 2.1 is but an example of a type of theorem which 
describes the topology of a space and describes properties of a space in 
terms of the ring of continuous functions on that space. (Cf. also Exercise 
5.) This is one way in which one can algebraicize the study of certain 
topological spaces. 


III, §3. ASCOLI'S THEOREM


In the examples of Chapter XVIII, §4, we shall deal with compact subsets 
of function spaces, and we need a criterion for compactness, which is 
provided by Ascoli’s theorem. It is also used in other places in analysis, 
for instance in a proof of the Riemann mapping theorem in complex 
analysis. Therefore, we give a proof here in the general discussion of 
compact spaces. 

Let X be a subset of a metric space, and let F be a Banach space. Let
$\Phi$ be a subset of the space of continuous maps C(X, F). We shall say
that $\Phi$ is (or its elements are) equicontinuous at a point $x_0 \in X$ if given $\varepsilon$,
there exists $\delta$ such that whenever $x \in X$ and $d(x, x_0) < \delta$, then

$$|f(x) - f(x_0)| < \varepsilon$$

for all $f \in \Phi$. We say that $\Phi$ is equicontinuous on X if it is equicontinuous
at every point of X.
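The definition can be probed numerically on a standard example. The family f_n(x) = x^n on X = [0, 1] is an illustrative choice of ours, not taken from the text: it is bounded by 1 but fails to be equicontinuous at the point 1. The sketch below samples, for several values of delta, the largest oscillation |f_n(x) - f_n(x_0)| over a finite set of indices n and of points x within delta of x_0; the helper names are hypothetical.

```python
def family(n):
    """f_n(x) = x**n on X = [0, 1]; an illustrative family, not from the text."""
    return lambda x: x ** n

def worst_oscillation(x0, delta, indices, samples=200):
    """Sampled sup of |f_n(x) - f_n(x0)| over n in indices and |x - x0| < delta."""
    xs = [x0 - delta + 2 * delta * (j + 0.5) / samples for j in range(samples)]
    xs = [x for x in xs if 0.0 <= x <= 1.0]      # stay inside X
    return max(abs(family(n)(x) - family(n)(x0)) for n in indices for x in xs)

indices = [2 ** k for k in range(21)]            # n = 1, 2, 4, ..., 2**20
deltas = (1e-1, 1e-2, 1e-3, 1e-4)
at_one = [worst_oscillation(1.0, d, indices) for d in deltas]
at_half = [worst_oscillation(0.5, d, indices) for d in deltas]
```

At x_0 = 1 the sampled oscillation stays close to 1 however small delta is, so no single delta works for all f_n at once; at x_0 = 1/2 it shrinks with delta. A finite sample can only suggest, not prove, the failure of equicontinuity.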


Theorem 3.1 (Ascoli's Theorem). Let X be a compact subset of a
metric space, and let F be a Banach space. Let $\Phi$ be a subset of the
space of continuous maps C(X, F) with sup norm. Then $\Phi$ is relatively
compact in C(X, F) if and only if the following two conditions are
satisfied:


ASC 1. $\Phi$ is equicontinuous.


ASC 2. For each $x \in X$, the set $\Phi(x)$ consisting of all values f(x) for
$f \in \Phi$ is relatively compact.

Proof. Assume that $\Phi$ satisfies the two conditions. We shall prove
that $\Phi$ is relatively compact. For this it is sufficient to show that $\Phi$ can
be covered by a finite number of balls of prescribed radius (Corollary 3.9
of Chapter II). Let r > 0. By equicontinuity, for each $x \in X$ we select an
open neighborhood V(x) such that if $y \in V(x)$, then $|f(y) - f(x)| < r$ for
all $f \in \Phi$. Then a finite number $V(x_1), \ldots, V(x_n)$ cover X. Each set

$$\Phi(x_1), \ldots, \Phi(x_n)$$

is relatively compact, and hence so is their union

$$Y = \Phi(x_1) \cup \cdots \cup \Phi(x_n).$$

Let $B(a_1), \ldots, B(a_m)$ be open balls of radius r centered at points $a_1, \ldots, a_m$
which cover Y. For $f \in \Phi$, the values $f(x_1), \ldots, f(x_n)$ lie in these balls.
In fact, for each i = 1, ..., n we have

$$f(x_i) \in B(a_{\sigma(i)})$$

where $\sigma: \{1, \ldots, n\} \to \{1, \ldots, m\}$ is some mapping. For each such map $\sigma$
let $\Phi_\sigma$ be the set of $f \in \Phi$ such that for all i, we have

$$|f(x_i) - a_{\sigma(i)}| < r.$$

Then the finite number of $\Phi_\sigma$ cover $\Phi$. It suffices now to prove that each
$\Phi_\sigma$ has diameter $\leq 4r$. But if $f, g \in \Phi_\sigma$ and $x \in X$, then x lies in some
$V(x_i)$, and then:

$$|f(x) - g(x)| \leq |f(x) - f(x_i)| + |f(x_i) - a_{\sigma(i)}| + |a_{\sigma(i)} - g(x_i)| + |g(x_i) - g(x)| < 4r.$$

This proves our implication, and the part of Ascoli's theorem which
is used in the applications. The converse is trivial and left to the reader.


Ascoli’s theorem is used mostly when F is the real or complex num- 
bers, and in that case, we reformulate it as a corollary. 


Corollary 3.2. Let X be a compact subset of a metric space, and let $\Phi$
be a subset of the space of continuous functions on X with sup norm.
Then $\Phi$ is relatively compact if and only if $\Phi$ is equicontinuous and
bounded (for the sup norm, of course).


Proof. For each $x \in X$, our hypothesis that $\Phi(x)$ is bounded implies
that $\Phi(x)$ is relatively compact, since a closed bounded subset of a finite
dimensional space is compact. So we can apply the theorem.

Remark. Since $\Phi$ has a metric defined by the sup norm, as a relatively
compact set it has the property that any sequence has a convergent
subsequence, converging in its closure. Sometimes one deals with a locally
compact set X which is a denumerable union of compact sets. In
that case, one obtains the following version of Ascoli's theorem.


Corollary 3.3. Let X be a metric space whose topology has a countable
base $\{U_i\}$ such that the closure $\overline{U}_i$ of each $U_i$ is compact. Let $\{f_n\}$ be a
sequence of continuous maps of X into a Banach space. Assume that
$\{f_n\}$ is equicontinuous (as a family of maps), and is such that for each
$x \in X$, the closure of the set $\{f_n(x)\}$ (n = 1, 2, ...) is compact. Then
there exists a subsequence which converges pointwise to a continuous
function f, and such that the convergence is uniform on every compact
subset.


Proof. We can find a sequence $\{V_i\}$ of open sets such that $\overline{V}_i \subset V_{i+1}$,
such that $\overline{V}_i$ is compact, and such that the union of the $V_i$ is X. For
each i, by the previous version of Ascoli's theorem, there exists a
subsequence which converges uniformly on $\overline{V}_i$. The diagonal sequence with
respect to all i converges uniformly on every compact set. This proves
the corollary.


Remark. In light of Urysohn’s metrization lemma, the hypotheses on 
X in the corollary could be given as X separable locally compact. 


III, §4. EXERCISES


0. Let S be a subset of a normed vector space (or a metric space), and let $\{f_n\}$
be a sequence of continuous maps of S into a Banach space F. Assume that
$\{f_n\}$ is a Cauchy sequence (for the sup norm). Show that $\{f_n\}$ converges to a
continuous function f (for the sup norm). Show that BC(S, F) is closed in
B(S, F).


1. Let X be a compact set and let R be the ring of continuous (real valued) 
functions on X. Let J, J' be closed ideals of R. Show that $J \subset J'$ if and only
if $Z(J) \supset Z(J')$.


2. Let S be a closed subset of X. Let J be the set of all $f \in R$ such that f
vanishes on S. Show that J is a closed ideal. Assume that X is Hausdorff. 
Establish a ring-isomorphism between the factor ring R/J and the ring of 
continuous functions on S. (We assume that you have had the notion of a 
factor ring in an algebra course.) 


3. Let X be a compact space and let J be an ideal of C(X). If the set of zeros 
of J is empty, show that J = C(X). (This result is valid in both the real and 
the complex case.) 


60 


10. 


CONTINUOUS FUNCTIONS ON COMPACT SETS [II], §4] 


. Let X be a compact Hausdorff space. Show that a maximal ideal of C(X) 


has only one zero, and is closed. (Recall that an ideal M is said to be 
maximal if M # C(X), and if there is no ideal J such that Mc JcCC(X) 
other than M and C(X) itself.) Thus if M is maximal, then there exists pe X 
such that M consists of all continuous functions f vanishing at p. 


. Let X be a normal space, and let R be the ring of continuous functions on 


X. Show that the topology on X is the one having the least amount of open 
sets making every function in R continuous. 


. Give a Taylor formula type proof that the absolute value can be approxi- 


mated uniformly by polynomials. First, reduce it to the interval [—1, 1] by 
multiplying the variable by c or c™* as the case may be. Then write |t| = 
/t?. Select 5 small, 0<6 <1. If we can approximate (t? + 6)'/7, then we 
can approximate /t?. Now to get (t? + 6)'/? either use the Taylor series 
approximation for the square root function, or if you don’t like the binomial 
expansion, first approximate 


log(t? + 6)1/? = 4 log(t? + 6) 
by a polynomial P. Then take a sufficiently large number of terms from the 


Taylor formula for the exponential function, say a polynomial Q, and use 
QO o P to solve your problems. 


. Give another proof for the preceding fact, by using the sequence of poly- 


nomials {P,}, starting with P,(t) = 0 and letting 
P,41(t) = P,(t) + 3(t — P,(t)’). 


Show that {P,} tends to /t uniformly on [0, 1], showing by induction that 


0< fi— p(y svt 
t 


2+n,/t 


whence 0 < \/t — P,(t) S 2/n. 


. Look at Example 1 of Chapter VIII, §3 to see another explicit way of 


proving Weierstrass’ approximation theorem for a continuous function on a 
finite closed interval. Do Exercise 1 of that chapter. 


. Let X be a compact set in a normed vector space, and let {f,} be a sequence 


of continuous functions converging pointwise to a continuous function f, and 
such that {f,} is a monotone increasing sequence. Show that the convergence 
is uniform (Dini’s theorem; cf. Chapter IX, §1). 


10. Let $X$ be a compact metric space (whence separable). Show that the
Banach space $C(X, \mathbf{R})$ or $C(X, \mathbf{C})$ of continuous functions
on $X$ is separable.

[Hint: Let $\{x_n\}$ be a countable dense set in $X$ and let $g_n$ be the
function on $X$ given by

$$g_n(x) = d(x, x_n),$$

where $d$ is the distance function. Use the Stone-Weierstrass theorem applied
to the algebra generated by all functions $g_n$ to conclude that
$C(X, \mathbf{R})$ is separable.] Note: Since a compact Hausdorff space is
normal, and since a normal separable space is metrizable, one can adjust the
statement of the theorem proved in the exercise as follows:

Let $X$ be a compact Hausdorff separable space. Then $C(X, \mathbf{R})$ is
separable.

11. Let $X$, $Y$ be compact Hausdorff spaces. If $f$, $g$ are continuous
functions on $X$ and $Y$ respectively, we denote by $f \otimes g$ the function
such that

$$(f \otimes g)(x, y) = f(x)g(y).$$

Show that every continuous function on $X \times Y$ can be uniformly
approximated by sums $\sum_{i=1}^{n} f_i \otimes g_i$ where $f_i$ is
continuous on $X$ and $g_i$ is continuous on $Y$.

12. Let $X$ be compact Hausdorff. By an algebra automorphism of $C(X)$ we mean
a map $\sigma: C(X) \to C(X)$ such that $\sigma$ leaves the constants fixed,
and satisfies

$$\sigma(f + g) = \sigma(f) + \sigma(g), \qquad \sigma(fg) = \sigma(f)\sigma(g).$$

Show that an algebra automorphism is norm preserving, i.e.
$\|\sigma f\| = \|f\|$.

13. Let $X$ be a compact Hausdorff space and let $A$ be a subalgebra of
$C(X, \mathbf{R})$. Show that there exists a continuous map
$\varphi: X \to Y$ of $X$ onto a compact space $Y$ such that every element of
$A$ can be written in the form $g \circ \varphi$, where $g$ is a continuous
function on $Y$.

14. Let $X$, $Y$ be compact Hausdorff spaces. Show that $X$ is homeomorphic to
$Y$ if and only if $C(X, \mathbf{C})$ is algebra-isomorphic to
$C(Y, \mathbf{C})$.

15. Let $X$ be a compact Hausdorff space. Let $\mathcal{M}$ be the set of all
maximal ideals in $C(X, \mathbf{C})$. Define a closed set in $\mathcal{M}$ to
consist of all maximal ideals containing a given ideal. Show that this defines
a topology on $\mathcal{M}$. For each $x \in X$, let $M_x$ be the ideal of
functions in $C(X, \mathbf{C})$ which vanish at $x$. Show that the map

$$x \mapsto M_x$$

is a homeomorphism between $X$ and $\mathcal{M}$.

16. For $a \in \mathbf{R}$ let $f_a(x) = e^{iax}e^{-x^2}$. Prove that any
function $\varphi$ which is $C^\infty$ and has compact support on $\mathbf{R}$
can be uniformly approximated by elements of the space generated by the
functions $f_a$ over $\mathbf{C}$. [Hint: If $\psi$ is a function vanishing
outside a compact set, and $N$ is a large integer, let $\psi_N$ be the
extension of $\psi$ on $[-N, N]$ to $\mathbf{R}$ by periodicity. Use the
partial sums of a Fourier series to approximate such an extension of
$\varphi(x)e^{x^2}$, and then multiply by $e^{-x^2}$.] Remark. Instead of
$e^{-x^2}$ you could use any function $h(x) > 0$ which is $C^\infty$, and
tends to 0 at infinity. This would not be the case in Exercises 19 and 20
below.


The next four exercises form a connected set.

17. Let $X$ be compact Hausdorff and let $p$ be a point of $X$. Let $A$ be a
subalgebra of $C(X, \mathbf{R})$ consisting of functions $g$ such that
$g(p) = 0$. Assume that there is no point $q \ne p$ such that $g(q) = 0$ for
all $g \in A$, and that $A$ separates the points of $X - \{p\}$. Then the
uniform closure of $A$ is equal to the ideal of all functions vanishing at $p$.

18. Let $X$ be locally compact Hausdorff, but not compact. Let
$C_\infty(X, \mathbf{R})$ be the algebra of continuous functions $f$ on $X$
such that $f$ vanishes at infinity (meaning, given $\varepsilon$ there exists
a compact $K$ such that $|f(x)| < \varepsilon$ if $x \notin K$). Let $A$ be a
subalgebra of $C_\infty(X, \mathbf{R})$ which separates points of $X$. Assume
that there is no common zero to all functions in $A$. Show that $A$ is dense
in $C_\infty(X, \mathbf{R})$.

19. Let $f$ be a real valued continuous function on $\mathbf{R}_{\ge 0}$
(reals $\ge 0$). Assume that $f$ vanishes at infinity. Show that $f$ can be
uniformly approximated by functions of the form $e^{-x}p(x)$, where $p$ is a
polynomial. [Hint: First show that you can approximate $e^{-2x}$ by
$e^{-x}q(x)$ for some polynomial $q(x)$, by using Taylor's formula with
remainder. If $p$ is a polynomial, approximate $e^{-nx}p(x)$ by $e^{-x}q(x)$
for some polynomial $q$.]

20. Let $f$ be a continuous function on $\mathbf{R}$, vanishing at infinity.
Show that $f$ can be uniformly approximated by functions of the form
$e^{-x^2}p(x)$, where $p$ is a polynomial.

Remark. By changing variables, one can use $e^{-cx}$ and $e^{-cx^2}$ with a
fixed $c > 0$ instead of $e^{-x}$ and $e^{-x^2}$ in Exercises 19 and 20.

21. Let $X$ be a metric space and $E$ a normed vector space. Let $BC(X, E)$ be
the space of bounded continuous maps of $X$ into $E$. Let $\Phi$ be a bounded
subset of $BC(X, E)$. For $x \in X$, let $ev_x: \Phi \to E$ be the map such
that $ev_x(\varphi) = \varphi(x)$. Show that $ev_x$ is a continuous bounded
map. Show that $\Phi$ is equicontinuous at a point $a \in X$ if and only if
the map $x \mapsto ev_x$ of $X$ into $BC(\Phi, E)$ is continuous at $a$.

22. Let $X$ be a compact subset of a normed vector space, and $E$ a normed
vector space. Show that any equicontinuous subset $\Phi$ of $C(X, E)$ is
uniformly equicontinuous. [This means: Given $\varepsilon$, there exists
$\delta$ such that $|x - y| < \delta$ implies $|f(x) - f(y)| < \varepsilon$
for all $f \in \Phi$.]

23. Let $X$ be a subset of a normed vector space and $\Phi$ an equicontinuous
subset of $BC(X, \mathbf{R})$. Let $Y$ be the set of points $x \in X$ such
that $\Phi(x)$ is bounded. Prove that $Y$ is open and closed in $X$. If $X$ is
compact and connected, and if for some point $a \in X$ the set $\Phi(a)$ is
bounded, show that $\Phi$ is relatively compact in $C(X, \mathbf{R})$.


PART TWO 


Banach and Hilbert Spaces 


The two chapters of this part are absolutely basic for everything else that 
follows, and introduce the most useful of all the spaces encountered in 
analysis, namely Banach and Hilbert spaces. The reader who wishes to 
study integration theory as soon as possible may continue these chapters 
with Chapter VI, which will make essential use of the basic properties of 
these spaces, especially the completion of a normed vector space and the 
linear extension theorem. Indeed, the integral of the absolute value of a 
function defines a seminorm on a suitable space of functions, whose com- 
pletion will be the main object of study of the chapters on integration. 

On the other hand, readers may look directly at the functional anal- 
ysis, as a continuation of the linear theory of Banach and Hilbert spaces. 
At some point, of course, these come together when we study the spectral 
theorems and the existence of spectral measures. 

As in the algebraic theory of vector spaces, we shall consider continuous
linear maps $L: E \to F$ of a normed vector space into another. The kernel
and image of $L$ are defined as in the algebraic theory, namely the kernel is
the set of elements $x \in E$ such that $L(x) = 0$. The image is simply
$L(E)$. Both Ker $L$ and Im $L$ are subspaces, of $E$ and $F$ respectively.
However, now that we have the norm, we note that the kernel is a closed
subspace (being the inverse image of the closed set $\{0\}$). Warning: the
image is not necessarily closed. For conditions under which the image is
closed, see Chapter XV.

For the integration theory, we do not need such considerations of 
subspace and factor space. However, we shall consider the dual space in 
the context of integration, showing that various spaces of functions are 
dual to each other. Thus we deal at somewhat greater length with the 
dual space in this chapter. An application of the duality theory in the 
context of Banach algebras will be given in Chapter XVI. 


CHAPTER IV 


Banach Spaces 


IV, §1. DEFINITIONS, THE DUAL SPACE, AND
THE HAHN-BANACH THEOREM


Let $E$ be a Banach space, i.e. a complete normed vector space. One can
deal with series $\sum x_n$ in Banach spaces just as with series of numbers,
or of functions, and the most frequent test for convergence (in fact absolute
convergence) is the standard one:

Let $\{a_n\}$ be a sequence of numbers $\ge 0$ such that $\sum a_n$
converges. If $|x_n| \le a_n$ for all $n$, then $\sum x_n$ converges.


The proof is standard and trivial. 
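One way to fill in the standard argument is sketched below. The partial sums $s_n$ are notation introduced here for the illustration; the estimate uses only the triangle inequality and the completeness of $E$.

```latex
s_n = x_1 + \cdots + x_n, \qquad
|s_n - s_m| = \Bigl| \sum_{k=m+1}^{n} x_k \Bigr|
  \le \sum_{k=m+1}^{n} |x_k|
  \le \sum_{k=m+1}^{n} a_k \qquad (m < n).
```

Since $\sum a_k$ converges, the right-hand side is arbitrarily small for $m$, $n$ large, so $\{s_n\}$ is a Cauchy sequence and therefore converges in the complete space $E$.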


Let $E$, $F$ be normed vector spaces. We denote by $L(E, F)$ the space
of continuous linear maps of $E$ into $F$. It is easily verified that a linear
map $\lambda: E \to F$ is continuous if and only if there exists $C > 0$ such
that $|\lambda(x)| \le C|x|$ for all $x \in E$. Indeed, if the $C$ exists,
continuity is obvious (even uniform continuity). Conversely, if $\lambda$ is
continuous at 0, then there exists $\delta$ such that if $|x| < \delta$, then
$|\lambda(x)| < 1$. Hence for any non-zero $x \in E$,

$$\left| \lambda\!\left( \frac{\delta x}{2|x|} \right) \right| < 1,$$

whence we can take $C = 2/\delta$.
Such a number $C$ is called a bound for $\lambda$, and $\lambda$ is also said
to be bounded. Let $S_1$ be the unit sphere in $E$ (centered at the origin),
that is the set of all $x \in E$ such that $|x| = 1$. Then a bound for
$\lambda$ is immediately seen to be the same thing as a bound for the values
of $\lambda$ on $S_1$. The least upper bound of all values $|\lambda(x)|$, for
$x \in S_1$, is called the norm of $\lambda$, and the map

$$\lambda \mapsto |\lambda|$$

is a norm on $L(E, F)$. It is immediately seen that $|\lambda|$ is the
greatest lower bound of all numbers $C > 0$ such that

$$|\lambda(x)| \le C|x|, \qquad \text{all } x \in E.$$


Let $E$, $F$, $G$ be normed vector spaces, let $u \in L(E, F)$, and let
$v \in L(F, G)$. Then $v \circ u$ is in $L(E, G)$ and we have

$$|v \circ u| \le |v|\,|u|.$$

Proof. A composite of continuous maps is continuous, and a composite of
linear maps is linear, so our first assertion is clear. As to the second, we
have

$$|v \circ u(x)| = |v(u(x))| \le |v|\,|u(x)| \le |v|\,|u|\,|x|,$$

so the desired inequality follows by definition.
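For a concrete instance of this inequality, one can take $E = F = G = \mathbf{R}^2$ with the sup norm. The sketch below assumes the standard fact, not proved in the text, that for the sup norm the norm of the linear map given by a matrix is its largest absolute row sum. The helper names `op_norm_sup` and `compose` and the two matrices are hypothetical choices made for this illustration.

```python
def op_norm_sup(m):
    """Norm of the linear map with matrix m (a list of rows), for the sup norm.

    Assumes the standard fact that this equals the largest absolute row sum.
    """
    return max(sum(abs(entry) for entry in row) for row in m)

def compose(v, u):
    """Matrix of the composite v o u, i.e. the matrix product v * u."""
    cols = list(zip(*u))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in v]

# Two sample maps on R^2 (arbitrary values chosen for the illustration).
u = [[1.0, -2.0], [0.5, 3.0]]
v = [[2.0, 1.0], [-1.0, 4.0]]
vu = compose(v, u)

if __name__ == "__main__":
    print(op_norm_sup(vu), op_norm_sup(v) * op_norm_sup(u))
```

In this example the inequality is strict, which shows that equality $|v \circ u| = |v|\,|u|$ should not be expected in general.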
If F is complete, then L(E, F) is complete. 


This is but an exercise. If $\{\lambda_n\}$ is a Cauchy sequence of elements
in $L(E, F)$, then for each $x \in E$ one verifies that $\{\lambda_n(x)\}$ is
a Cauchy sequence in $F$, and hence converges to an element which we define to
be $\lambda(x)$. One then verifies that $\lambda$ is linear, and that if
$C = \lim |\lambda_n|$, then $C$ is a bound for $\lambda$, so that $\lambda$
is continuous. Finally one verifies that $\{\lambda_n\}$ converges to
$\lambda$ in $L(E, F)$. (Fill in the details as Exercise 1, or look them up in
Undergraduate Analysis.)


We give some terminology concerning the space L(E, F) which is used 
constantly in this book, and in analysis. 

A continuous (bounded) linear map of a Banach space into itself is 
called an endomorphism, or an operator. 

In the case of two spaces $E$, $F$, an element $u \in L(E, F)$ is said to be
invertible if there exists $v \in L(F, E)$ such that

$$u \circ v = I \quad \text{and} \quad v \circ u = I$$

(where $I$ is the identity mapping). In mathematics, the word isomorphism
refers to invertibility in various contexts, for instance a map having a
continuous inverse, a linear inverse, a differentiable inverse, etc. ad lib.
Thus in each case, one should add an adjective to the word isomorphism
to make precise the kind of invertibility which is meant. In our present
case, we shall call invertible elements of $L(E, F)$ toplinear isomorphisms,
the adjective toplinear referring to the topology and the linearity. The
set of toplinear isomorphisms of $E$ onto $F$ is denoted by Lis$(E, F)$. If
$E = F$, then we call toplinear isomorphisms of $E$ with itself toplinear
automorphisms of $E$; the set of such automorphisms is denoted by
Laut$(E)$. (For euphony, the reader may prefer the adjective topolinear
instead of toplinear.)

A toplinear isomorphism $u$ between Banach spaces $E$, $F$ which also
preserves the norm (that is $|u(x)| = |x|$ for all $x \in E$) will be called a
Banach isomorphism, or an isometry.

We shall also be dealing with bilinear maps. Let $E$, $F$, $G$ be normed
vector spaces. A map

$$\varphi: E \times F \to G$$

is said to be bilinear if for each $x \in E$ the map $y \mapsto \varphi(x, y)$
is linear, and if for each $y \in F$ the map $x \mapsto \varphi(x, y)$ is
linear. Such bilinear maps form a vector space. It is easily verified (in a
manner similar to the case of linear maps) that $\varphi$ is continuous if and
only if there exists $C$ such that

$$|\varphi(x, y)| \le C|x|\,|y|$$

for all $x \in E$, $y \in F$. The greatest lower bound of such $C$ then
defines a norm on the space of continuous bilinear maps, denoted by
$L(E, F; G)$, and this space is a Banach space if $G$ is complete. (Cf.
Exercise 3.)

In the differential calculus, and other applications, we need an
isomorphism between $L(E, L(F, G))$ and $L(E, F; G)$ as follows. Let
$\lambda \in L(E, L(F, G))$ and define $\varphi_\lambda$ by

$$\varphi_\lambda(x, y) = \lambda(x)(y).$$

Then $\varphi_\lambda$ is obviously bilinear, and we have

$$|\varphi_\lambda(x, y)| \le |\lambda(x)|\,|y| \le |\lambda|\,|x|\,|y|$$

so that

$$|\varphi_\lambda| \le |\lambda|.$$

On the other hand, given $\varphi \in L(E, F; G)$, we can define
$\lambda_\varphi$ by

$$\lambda_\varphi(x)(y) = \varphi(x, y).$$

Then

$$|\lambda_\varphi(x)(y)| \le |\varphi|\,|x|\,|y|$$

so that by definition,

$$|\lambda_\varphi(x)| \le |\varphi|\,|x|.$$

Hence

$$|\lambda_\varphi| \le |\varphi|.$$

Thus we get a Banach isomorphism

$$L(E, L(F, G)) \leftrightarrow L(E, F; G).$$

As one example of a bilinear map, we have

$$L(E, F) \times E \to F$$

such that $(\lambda, x) \mapsto \lambda(x)$. This bilinear map has norm 1.
Similarly, we can treat multilinear maps. If $E_1, \ldots, E_n$, $F$ are
normed vector spaces, a multilinear map

$$\varphi: E_1 \times \cdots \times E_n \to F$$

is a map which is linear in each variable. Such a map is continuous if
and only if there exists $C$ such that for all $x_i \in E_i$ we have

$$|\varphi(x_1, \ldots, x_n)| \le C|x_1|\,|x_2| \cdots |x_n|.$$

We have a norm-preserving isomorphism

$$L(E_1, L(E_2, \ldots, L(E_n, F) \ldots)) \leftrightarrow L(E_1, \ldots, E_n; F)$$

from the space of repeated continuous linear maps to the space of continuous
multilinear maps exactly as in the bilinear case. If $F$ is complete,
then all these spaces are also complete.

We now consider a specially important space of linear maps.

The normed vector space $L(E, \mathbf{R})$ [or $L(E, \mathbf{C})$ in the
complex case] is called the dual space of $E$, and is denoted by $E'$.
Elements of $E'$ are called functionals on $E$. Functionals can be used as
substitutes for coordinates. Indeed, suppose that $E = \mathbf{R}^n$, and let
$\lambda_i$ be the $i$-th coordinate function, that is

$$\lambda_i(x_1, \ldots, x_n) = x_i.$$

Then it is easily verified that $\{\lambda_1, \ldots, \lambda_n\}$ is a basis
for the dual space of $\mathbf{R}^n$. Furthermore, the values of
$\lambda_1, \ldots, \lambda_n$ on an element $x \in \mathbf{R}^n$
characterize this element. Although we do not have such convenient bases in
the infinite dimensional case, we still have such a characterization of
elements of $E$ in terms of the values of functionals. This is based on the
following theorem.

Theorem 1.1. Let $E$ be a real normed vector space, and let $F$ be a
subspace. Let $\lambda: F \to \mathbf{R}$ be a functional, bounded by a number
$C > 0$. Then there exists an extension of $\lambda$ to a functional of $E$,
having the same bound.

Proof. Changing the norm on $E$ (multiplying it by a number) we see
that it suffices to prove our theorem when $C = 1$. We first prove that if
$v \in E$ and $v \notin F$, then we can extend $\lambda$ to
$F + \mathbf{R}v$, and preserve the bound 1. Every element of
$F + \mathbf{R}v$ has a unique expression as $x + tv$ with $x \in F$ and
$t \in \mathbf{R}$. Let $a \in \mathbf{R}$. The map $\lambda^*$ on
$F + \mathbf{R}v$ such that

$$\lambda^*(x + tv) = \lambda(x) + ta$$

is certainly linear. We must show that we can select $a$ such that
$\lambda^*$ is bounded by 1, that is $|\lambda(x) + ta| \le |x + tv|$.
Dividing both sides by $t$ (if $t \ne 0$), we see that it suffices to find a
number $a$ such that

$$|\lambda(y) + a| \le |y + v|$$

for all $y \in F$, or equivalently that for all $y \in F$,

$$\lambda(y) + a \le |y + v| \quad \text{and} \quad -\lambda(y) - a \le |y + v|.$$

This determines inequalities for $a$, namely

$$-\lambda(y) - |y + v| \le a \le -\lambda(y) + |y + v|,$$

and it suffices to show that the set of real $a$ satisfying such inequalities
is not empty. But for all $y, z \in F$ we have

$$|\lambda(y) - \lambda(z)| = |\lambda(y - z)| \le |y - z| \le |y + v| + |z + v|$$

so that

$$-\lambda(z) - |z + v| \le -\lambda(y) + |y + v|.$$

From this we conclude that there is a non-empty interval of values of $a$
which satisfy our requirements.
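The interval of admissible values of $a$ can be computed in a small example. The sketch below is an illustration under assumptions made here: $E = \mathbf{R}^2$, $F$ is the first coordinate axis, $\lambda(s, 0) = s$, $v = (0, 1)$, and the supremum and infimum over $F$ are only taken over a finite sample of points, which in general gives an outer estimate of the true interval. All function names are hypothetical.

```python
def add(p, q):
    """Sum of two vectors of R^2 given as pairs."""
    return (p[0] + q[0], p[1] + q[1])

def norm_l1(p):
    return abs(p[0]) + abs(p[1])

def norm_sup(p):
    return max(abs(p[0]), abs(p[1]))

def lam(y):
    """The functional on F = {(s, 0)}; it has bound 1 for both norms above."""
    return y[0]

def admissible_interval(norm, v, samples):
    """Estimate [lo, hi] with lo = max(-lam(y) - |y + v|), hi = min(-lam(y) + |y + v|).

    Only the finitely many sample points y of F are used (an assumption of
    this sketch), so the result is an outer estimate in general.
    """
    lo = max(-lam(y) - norm(add(y, v)) for y in samples)
    hi = min(-lam(y) + norm(add(y, v)) for y in samples)
    return lo, hi

v = (0.0, 1.0)
samples = [(k / 4.0, 0.0) for k in range(-400, 401)]
interval_l1 = admissible_interval(norm_l1, v, samples)
interval_sup = admissible_interval(norm_sup, v, samples)

if __name__ == "__main__":
    print(interval_l1, interval_sup)
```

With the norm $|s| + |t|$ every $a$ between $-1$ and $1$ gives an extension $(s, t) \mapsto s + at$ with bound 1, while with the sup norm the interval collapses to the single value $a = 0$. The extension is therefore not unique in general.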

We now use Zorn's lemma. We consider the set of pairs $(G, \lambda^*)$ where
$G$ is a subspace of $E$ containing $F$, and $\lambda^*$ is a functional on
$G$ having the same bound as $\lambda$, and extending $\lambda$. We order such
pairs

$$(G_1, \lambda_1) \le (G_2, \lambda_2)$$

if $G_1$ is a subspace of $G_2$ and $\lambda_2$ is an extension of
$\lambda_1$. This is an ordering, and our set of pairs is inductively ordered.
The proof of this is the usual proof: Given a totally ordered set of pairs as
above, say $\{(G_i, \lambda_i)\}$, we let $G$ be the union of all $G_i$. We
can define a functional $\lambda^*$ on $G$ extending all $\lambda_i$: Any
$x \in G$ is in some $G_i$, and we define $\lambda^*(x) = \lambda_i(x)$. This
is independent of the choice of $i$ such that $x \in G_i$, and the pair
$(G, \lambda^*)$ is an upper bound for our family. By Zorn's lemma, let
$(G, \lambda^*)$ be a maximal element. Then $G = E$, for otherwise, there is
some $v \in E$, $v \notin G$, and we can use the first part of the proof to
get a bigger pair. This proves our theorem.


Corollary 1.2. Let $E$ be a normed vector space, and $v \in E$, $v \ne 0$.
Then there exists a functional $\lambda$ on $E$ such that $\lambda(v) \ne 0$.

Proof. Let $F$ be the one-dimensional space generated by $v$. We define
$\lambda$ on $F$ taking any non-zero value on $v$, and extend $\lambda$ to $E$
using Theorem 1.1.

Theorem 1.1, or its Corollary, is referred to as the Hahn-Banach
theorem. We have formulated it over the reals, but it is also valid for
complex Banach spaces, and the complex case is easily reduced to the
real case. Indeed, given a complex functional $\lambda$ on a complex subspace
$F$, let $\varphi$ be its real part. Let $\varphi'$ be a real extension of
$\varphi$ to $E$, and define

$$\lambda'(v) = \varphi'(v) - i\varphi'(iv) \qquad \text{for } v \in E.$$

You can verify as Exercise 2 that $\lambda'$ is a desired complex extension of
$\lambda$.

The dual space $E'$ is a special case of the space of linear maps $L(E, F)$
when $F$ is the space of scalars. As such, we have seen that it is a Banach
space with its natural norm. Furthermore, we can form the double dual
$E''$ in a similar fashion, and $E''$ is also a Banach space. Note that each
element $x \in E$ gives rise to a functional $f_x \in E''$, given by

$$f_x: E' \to \text{scalars } \mathbf{R} \text{ or } \mathbf{C} \quad \text{such that} \quad f_x(\lambda) = \lambda(x),$$

continuous for the topology defined by the norm on $E'$.

Proposition 1.3. The map $x \mapsto f_x$ is an injective linear map of $E$
into $E''$, which is norm preserving, i.e. $|x| = |f_x|$.

Proof. Suppose $x, y \in E$ and $x \ne y$. Then $x - y \ne 0$. By the
Hahn-Banach theorem, there exists $\lambda \in E'$ such that
$\lambda(x - y) \ne 0$, so $\lambda(x) \ne \lambda(y)$. This proves that
$f_x \ne f_y$, whence the map $x \mapsto f_x$ is injective. The inequality

$$|\lambda(x)| \le |\lambda|\,|x|$$

shows that $|f_x| \le |x|$. We leave to the reader the opposite inequality
$|x| \le |f_x|$, which concludes the proof that we have an isometric
embedding of $E$ in $E''$.
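The opposite inequality left to the reader can be obtained from Theorem 1.1. The following is one possible sketch in the real case; the auxiliary functional $\lambda_0$ is notation introduced here, and the complex case would go through the reduction to the real case described after Corollary 1.2.

```latex
\begin{aligned}
&\text{Let } x \ne 0,\quad F = \mathbf{R}x,\quad \lambda_0(tx) = t|x| \quad (t \in \mathbf{R}). \\
&|\lambda_0(tx)| = |t|\,|x| = |tx|, \text{ so } \lambda_0 \text{ is bounded by } 1 \text{ on } F. \\
&\text{By Theorem 1.1 there is } \lambda \in E' \text{ extending } \lambda_0 \text{ with } |\lambda| \le 1. \\
&|x| = \lambda(x) = f_x(\lambda) \le |f_x|\,|\lambda| \le |f_x|.
\end{aligned}
```

For $x = 0$ both sides vanish, so $|x| \le |f_x|$ holds for all $x \in E$.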
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In Chapter II, §1 we defined the weak topology on a space, deter- 
mined by a set of mappings into a topological space. We now apply this 
notion to the dual space. We let $\mathcal{F}$ be the family of functions on
$E'$ given by

$$\mathcal{F} = \{f_x\}_{x \in E}$$

as above. The weak topology on $E'$ determined by this family $\mathcal{F}$ is
called simply the weak topology on $E'$. The next theorem gives one of its
most important properties.

Theorem 1.4 (Alaoglu's Theorem). Let $E$ be a Banach space, and let $E'_1$
be the unit ball in the dual space $E'$. Then $E'_1$ is compact for the weak
topology.


Proof. For each $x \in E$, let $K_x$ be the closed disc of radius 1 in
$\mathbf{C}$. Let

$$K = \prod_{\substack{x \in E \\ |x| \le 1}} K_x$$

be the Cartesian product of all closed discs of radius 1, taken over all
$x \in E$ satisfying $|x| \le 1$. We give $K$ the product topology, so that by
Tychonoff's theorem, $K$ is compact. We map $E'_1$ into $K$ by the map

$$f: E'_1 \to K \quad \text{such that} \quad \lambda \mapsto \prod_{|x| \le 1} \lambda(x) = \prod_{|x| \le 1} f_x(\lambda).$$

Immediately from the definition, one sees that the map $f$ is injective, and
thus gives an embedding of $E'_1$ into the product space. Furthermore, also
from the definition of the weak topology defined in Chapter II, §1, we
observe that the weak topology determined by the family $\mathcal{F}$ is the
same as the weak topology determined by the family $\mathcal{F}_1$ of
functionals $f_x$ with $x \in E_1$ (the closed unit ball in $E$), because any
$x \in E$, $x \ne 0$ is a scalar multiple of a unit vector. More precisely, we
also have an imbedding

$$f: E' \to \prod_{x \in E} C_x, \quad \text{given by} \quad \lambda \mapsto \prod_{x \in E} \lambda(x),$$

and the following diagram is commutative:

[Commutative diagram relating the embedding of $E'_1$ in $\prod K_x$ and the
embedding of $E'$ in $\prod_{x \in E} C_x$.]

The product topology induced on $\prod K_x$ is the same as the topology
induced by viewing this product as a subspace of $\prod C_x$. Therefore, it
follows that the weak topology on $E'_1$ is the topology induced by viewing
$E'_1$ as a subspace of $K$ via the embedding $f$, or also as a subspace of
$\prod C_x$ $(x \in E)$, via the embedding of $E'$ in $\prod C_x$. To show
that $E'_1$ is compact, it suffices therefore to show that $f(E'_1)$ is closed
in $K$.

To do this, we first prove that $E'$ is closed in $\prod C_x$ $(x \in E)$.
Let $\prod \gamma(x)$ $(x \in E)$ be an element of the product which lies in
the closure of $f(E')$. Given elements $x, y \in E$, we have to show that
$x \mapsto \gamma(x)$ is a bounded functional. By definition of the weak
topology, given $\varepsilon$ there exists $\lambda \in E'$ such that

$$|\lambda(x) - \gamma(x)| < \varepsilon, \qquad |\lambda(y) - \gamma(y)| < \varepsilon, \qquad |\lambda(x + y) - \gamma(x + y)| < \varepsilon.$$

But $\lambda(x + y) = \lambda(x) + \lambda(y)$, whence
$|\gamma(x + y) - \gamma(x) - \gamma(y)| < 3\varepsilon$, so

$$\gamma(x + y) = \gamma(x) + \gamma(y).$$

Similarly, one sees that $\gamma(cx) = c\gamma(x)$ for $c \in \mathbf{C}$,
whence $\gamma$ is linear. Also similarly, one sees that $\gamma$ is bounded.
Furthermore, if $\prod \gamma(x)$ lies in the closure of $E'_1$, then the
above $\lambda$ can be chosen such that $|\lambda| \le 1$, that is
$|\lambda(x)| \le |x|$. Then by a similar epsilon argument, one sees that
$|\gamma(x)| \le |x|$, which proves that $f(E'_1)$ is closed, whence compact,
thus concluding the proof of Theorem 1.4.


Remark. In the case of Hilbert space, to be defined in the next 
chapter, the Banach space E is self dual, and so in this case, one may 
state that the unit ball in Hilbert space is compact in the weak topology. 


IV, §2. BANACH ALGEBRAS 


An algebra (say over $\mathbf{R}$) is a vector space $A$, together with a
mapping $A \times A \to A$ (called a multiplication) which is bilinear. This
means that for all $u, v, w \in A$ and $c \in \mathbf{R}$ we have

$$u(v + w) = uv + uw, \qquad (u + v)w = uw + vw,$$
$$c(uv) = (cu)v = u(cv).$$

If in addition we have $uv = vu$, we say that the algebra is commutative.
If $u(vw) = (uv)w$, we say that the algebra is associative. If there exists an
element $e \in A$ such that $eu = ue = u$ for all $u \in A$, we say that the
algebra has a unit element $e$, which is then uniquely determined, because if
$e'$ is another unit element, then

$$e = ee' = e'.$$

A normed algebra is an associative algebra whose vector space is
normed, and whose norm satisfies the condition $|uv| \le |u|\,|v|$. A normed
algebra which is complete is called a Banach algebra.

For convenience when there is a unit element, we shall also assume
that $|e| = 1$. See Exercise 5 which shows that this condition can always
be achieved by a simple redefinition of the norm.


Example 1. Let A be the vector space of bounded functions on a set, 
multiplication being ordinary multiplication of functions. Then A is a 
Banach algebra. So is the set of bounded continuous functions. 


Example 2. Let $A = \mathbf{R}^3$ and let the product be the cross product.
Then $A$ is neither commutative nor associative, but otherwise satisfies
the other axioms of a normed algebra. Since non-associative algebras
occur so rarely in what we do, we have taken associativity into the
definition of a normed algebra, so that the present example is not that of
a normed algebra in our sense.


Example 3. Let $E$ be a normed vector space. Then $L(E, E)$ is an
algebra, if we define the multiplication to be composition of mappings.
In other words, if $u, v \in L(E, E)$, then the product $u \circ v$ is again a
continuous linear map of $E$ into itself, and we have associativity and
bilinearity, which follow at once from the definition of the sum of two linear
maps. Furthermore, $L(E, E)$ has a unit element $I$ which is the identity
mapping. We often write $uv$ instead of $u \circ v$. Elements of $L(E, E)$ are
also called endomorphisms of $E$, or operators on $E$, and we abbreviate
$L(E, E)$ by End$(E)$. If $E$ is complete, i.e. a Banach space, then from
remarks made in §1, we conclude that End$(E)$ is a Banach algebra. Of course,
End$(E)$ is not necessarily commutative. It is the most important algebra
studied in this book. If $E$ is finite dimensional, this algebra is
essentially the algebra of $n \times n$ matrices, where $n = \dim E$.


Example 4. Let $E$ be the vector space of continuous functions on
$\mathbf{R}$, periodic of period $2\pi$, with the sup norm. Then $E$ is a
Banach space. If $f, g \in E$, we define a product called the convolution
product by

$$f * g(x) = \frac{1}{2\pi} \int_{-\pi}^{\pi} f(t)g(x - t)\,dt.$$

It follows easily from elementary integrations that $E$ is then a
commutative, associative Banach algebra. Note that $E$ does not have a unit
element. In this direction, see Chapter III, §1.


We observe that an algebra with a unit element contains a replica of
the scalars, under the map

$$c \mapsto ce,$$

which is injective, and preserves addition and multiplication. In the case
of $L(E, E)$, an element $cI$ ($I$ = Identity) is simply "multiplication by
$c$."

Let $A$ be an associative algebra with unit element $e$. An element $u$ of
$A$ is said to be invertible if there exists $v \in A$ such that
$uv = vu = e$. The element $v$ is uniquely determined by $u$, because if
$uw = wu = e$, then multiplying on the left by $v$ shows that $w = vuw = v$.
We call this element the inverse of $u$ and denote it by $u^{-1}$. An
invertible element is also called a unit. If $u$, $v$ are invertible, then so
is $uv$, because

$$(uv)^{-1} = v^{-1}u^{-1}.$$


Theorem 2.1. Let $A$ be a Banach algebra with unit element $e$. Then the
set of invertible elements is open in $A$. If $v \in A$ and $|v| < 1$, then
$e + v$ is invertible.

Proof. Let $|v| < 1$. Then the series $e + v + v^2 + \cdots$ converges
(absolutely) and since

$$(e - v)(e + v + \cdots + v^n) = e - v^{n+1},$$

it follows that $e - v$ is invertible, and that its inverse is the limit of
$e + v + \cdots + v^n$ as $n \to \infty$. That we have $-v$ instead of $v$
makes no difference, since $|-v| = |v|$. Suppose now that $u$ is invertible,
and let

$$|w - u| < 1/|u^{-1}|.$$

Then

$$|wu^{-1} - e| = |(w - u)u^{-1}| \le |w - u|\,|u^{-1}| < 1.$$

Hence $wu^{-1}$ is invertible, whence $w$ is invertible, thus proving our
theorem.
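The geometric series used in the proof can be tried out in the Banach algebra of $2 \times 2$ real matrices. The sketch below is an illustration only: the helper names, the sample matrix `v`, and the number of terms are assumptions made here, and the norm in question is taken to be the largest absolute row sum, which is less than 1 for the chosen `v`.

```python
def mat_mul(a, b):
    """Product of two square matrices given as lists of rows."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]

def mat_add(a, b):
    """Entrywise sum of two matrices of the same shape."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def neumann_inverse(v, terms=60):
    """Partial sum e + v + ... + v**terms, approximating the inverse of e - v.

    The approximation is only meaningful when the norm of v is less than 1.
    """
    n = len(v)
    identity = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    total, power = identity, identity
    for _ in range(terms):
        power = mat_mul(power, v)
        total = mat_add(total, power)
    return total

# Sample element with largest absolute row sum 0.5 < 1 (arbitrary choice).
v = [[0.2, 0.3], [-0.1, 0.4]]
approx = neumann_inverse(v)
e_minus_v = [[1.0 - 0.2, -0.3], [0.1, 1.0 - 0.4]]
check = mat_mul(e_minus_v, approx)  # expected to be close to the identity

if __name__ == "__main__":
    print(check)
```

The identity $(e - v)(e + v + \cdots + v^n) = e - v^{n+1}$ explains the quality of the approximation: the error is $v^{n+1}$, whose norm is at most $|v|^{n+1}$.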


We observe that the map $u \mapsto u^{-1}$ is continuous (as a map defined on
the set of invertible elements). The usual proof is valid.


Corollary 2.2. Let $E$, $F$ be Banach spaces. Then the set of toplinear
isomorphisms of $E$ onto $F$ is open in $L(E, F)$.

Proof. Suppose that this set is not empty, and let $u: E \to F$ be a
toplinear isomorphism. Then for $v \in L(E, F)$ we have

$$|u^{-1}v - I| = |u^{-1}(v - u)| \le |u^{-1}|\,|v - u|.$$

If $v$ is close to $u$, then $u^{-1}v$ is close to $I$, and is invertible by
Theorem 2.1, so there exists $w_1$ such that

$$w_1 u^{-1} v = I_E.$$

Similarly, there exists a toplinear automorphism $w_2$ of $F$ such that

$$v u^{-1} w_2 = I_F.$$

Thus $v$ has a left inverse and a right inverse, say $v_1$, $v_2$, such that

$$v_1 v = I_E \quad \text{and} \quad v v_2 = I_F.$$

Considering $v_1 v v_2$ and using associativity shows that $v_1 = v_2$, whence
$v$ is invertible.


IV, §3. THE LINEAR EXTENSION THEOREM 


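Theorem 3.1. Let $E$ be a normed vector space, $F$ a subspace, and $G$ a
complete normed vector space. Let

$$\lambda: F \to G$$

be a continuous linear map, with norm $C$. Then the closure $\overline{F}$ of
$F$ in $E$ is a subspace of $E$. There exists a unique extension of $\lambda$
to a continuous linear map $\overline{\lambda}: \overline{F} \to G$, and
$\overline{\lambda}$ has the same norm as $\lambda$.

Proof. Elements in $\overline{F}$ are limits of sequences in $F$. Thus if

$$x = \lim x_n \quad \text{and} \quad y = \lim y_n,$$

then

$$x + y = \lim(x_n + y_n)$$

and for $c \in \mathbf{R}$,

$$cx = \lim(cx_n).$$

Hence $\overline{F}$ is a subspace of $E$.
The uniqueness of $\overline{\lambda}$ is clear from continuity. We show its
existence. Let $x \in \overline{F}$, and let $x = \lim x_n$ with
$x_n \in F$. Then

$$|\lambda(x_n) - \lambda(x_m)| = |\lambda(x_n - x_m)| \le C|x_n - x_m|.$$

Hence $\{\lambda x_n\}$ is a Cauchy sequence in $G$, and since $G$ is assumed
to be complete, $\{\lambda x_n\}$ has a limit in $G$ which we denote by
$\overline{\lambda}x$. This value is independent of the sequence
$x_n \to x$, for if $x = \lim x'_n$ with $x'_n \in F$, then
$|\lambda x_n - \lambda x'_n| \le C|x_n - x'_n|$ tends to 0, so
$\lim \lambda x_n = \lim \lambda x'_n$. If

$$y \in \overline{F} \quad \text{and} \quad y = \lim y_n$$

with $y_n \in F$, then for $c \in \mathbf{R}$,

$$x + y = \lim(x_n + y_n) \quad \text{and} \quad cx = \lim(cx_n).$$

Hence

$$\overline{\lambda}(x + y) = \lim \lambda(x_n + y_n) = \lim(\lambda x_n + \lambda y_n) = \lim \lambda x_n + \lim \lambda y_n = \overline{\lambda}x + \overline{\lambda}y.$$

Similarly, $\overline{\lambda}(cx) = c\overline{\lambda}(x)$. Hence
$\overline{\lambda}$ is linear, and since for $x \in F$ we have
$x = \lim x$, it follows that $\overline{\lambda}x = \lambda x$ if
$x \in F$. Thus $\overline{\lambda}$ is an extension of $\lambda$.
Finally, we have

$$|\overline{\lambda}x| = \lim |\lambda x_n|$$

because the norm is a continuous function. Since

$$|\lambda x_n| \le C|x_n|,$$

it follows that

$$\lim |\lambda x_n| \le C|\lim x_n| = C|x|,$$

because limits preserve inequalities. This proves that a bound for $\lambda$
is also a bound for $\overline{\lambda}$ and hence that
$|\overline{\lambda}| = |\lambda|$. This also concludes the proof of
Theorem 3.1.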


We shall see examples of Theorem 3.1 very frequently in the sequel, 
notably in the existence proof for the completion of a normed vector 
space, in integration, Chapter VI, §3 and Chapter XIII, §1; and in the 
spectral theorem of Chapter XVIII. 


IV, §4. COMPLETION OF A NORMED VECTOR SPACE 


Let $E$ be a normed vector space. We wish to associate with $E$ a
complete normed vector space in a manner analogous to that which
associates the real numbers to the rational numbers. We shall follow
the method of Cauchy sequences. For another method, cf. Exercise 25.
We define a completion of $E$ to be a pair $(\overline{E}, \varphi)$
consisting of a Banach space $\overline{E}$ and a continuous linear map

$$\varphi: E \to \overline{E}$$

which is injective, such that $\varphi(E)$ is dense in $\overline{E}$, and
such that $\varphi$ preserves the norm, i.e. $|\varphi x| = |x|$ for all
$x \in E$. We shall now prove that such a pair is essentially uniquely
determined. In fact, if $(F, \psi)$ is another completion, then there exists a
unique invertible element $\lambda \in L(\overline{E}, F)$ such that the
following diagram is commutative, in other words
$\psi = \lambda \circ \varphi$.

$$\begin{array}{ccc}
\overline{E} & \xrightarrow{\ \lambda\ } & F \\
{\scriptstyle \varphi} \nwarrow & & \nearrow {\scriptstyle \psi} \\
 & E &
\end{array}$$

The proof is in fact very easy. The map

$$\psi \circ \varphi^{-1}: \varphi(E) \to \psi(E) \subset F$$

is continuous and linear (it even preserves the norm) and consequently,
by the linear extension theorem, it has a unique continuous linear extension
of $\overline{E}$ into $F$, which we denote by $\lambda$. Similarly, the
continuous linear map

$$\varphi \circ \psi^{-1}: \psi(E) \to \varphi(E) \subset \overline{E}$$

has a continuous linear extension of $F$ into $\overline{E}$, which we denote
by $\mu$. Then $\mu \circ \lambda: \overline{E} \to \overline{E}$ gives the
identity when restricted to $\varphi(E)$, and hence is equal to the identity
on $\overline{E}$ itself by continuity (or by the uniqueness part of the
linear extension theorem). Similarly, $\lambda \circ \mu: F \to F$ is the
identity. This proves the uniqueness of the completion.


We observe that our toplinear isomorphism $\lambda$ preserves norms, that is

$$|\lambda x| = |x|$$

for all $x \in \overline{E}$. This again follows by continuity.

We shall now give two proofs of the existence of a completion. So let
$E$ be a normed vector space and let $E'$ be its dual. As we saw in
Proposition 1.3, we have a natural norm-preserving injection $E \to E''$.
But $E''$ is complete because $E'' = L(E', F)$ with complete $F$ ($F$ =
scalars). So the completion of $E$ is simply the closure $\overline{E}$ of
$E$ in $E''$. (Do Exercise 15.)

Next we give another proof, based on the same construction as the
real numbers from the rational numbers. This construction will be used
in the integration theory. See the examples after the construction.

The Cauchy sequences of elements of $E$ form a vector space, which we
denote by $S$. As usual, we have the notion of null sequences, that is
sequences $\{x_n\}$ in $E$ such that given $\varepsilon$, there exists $N$
such that for all $n > N$ we have $|x_n| < \varepsilon$. The null sequences
form a subspace. We define two Cauchy sequences $\xi = \{x_n\}$ and
$\eta = \{y_n\}$ to be equivalent if there exists a null sequence
$\alpha = \{a_n\}$ such that $\xi = \eta + \alpha$ (in other words
$x_n = y_n + a_n$ for all $n$). This is an equivalence relation, and we denote
the equivalence class of $\xi$ by $\overline{\xi}$. Then the equivalence
classes of Cauchy sequences form a vector space in a natural way, and we have
(for $c \in \mathbf{R}$):

$$\overline{\xi + \eta} = \overline{\xi} + \overline{\eta} \quad \text{and} \quad \overline{c\xi} = c\overline{\xi}.$$

We denote the vector space of equivalence classes of Cauchy sequences
by $\overline{E}$. (It is nothing but the factor space of Cauchy sequences
modulo the subspace of null sequences.)

If $\xi = \{x_n\}$ is a Cauchy sequence and $\eta = \{y_n\}$ is equivalent to
$\xi$, then

$$\lim_{n \to \infty} |x_n| = \lim_{n \to \infty} |y_n|.$$

Then we define

$$|\overline{\xi}| = \lim_{n \to \infty} |x_n|.$$

It is verified at once that this is a norm on $\overline{E}$, which is thus a
normed vector space.
We let

$$\varphi: E \to \overline{E}$$

be the map such that $\varphi(x)$ is the class of the Cauchy sequence
$\{x, x, \ldots\}$. Then it is clear that $\varphi$ is linear, and preserves
norms. Furthermore, one sees at once that if $\overline{\xi}$ is the class of
a Cauchy sequence $\xi$, and $\xi = \{x_n\}$, then

$$\overline{\xi} = \lim_{n \to \infty} \varphi(x_n).$$

Hence $\varphi(E)$ is dense in $\overline{E}$.
All that remains to prove is that $\overline{E}$ is complete. To do this, let
$\{\overline{\xi}_n\}$ be a Cauchy sequence in $\overline{E}$. For each $n$
there exists an element $x_n \in E$ such that

$$|\overline{\xi}_n - \varphi x_n| < 1/n,$$

because $\varphi(E)$ is dense in $\overline{E}$. The sequence $\{x_n\}$ is
then Cauchy (in $E$). Indeed, we have

$$|x_n - x_m| = |\varphi x_n - \varphi x_m| \le |\varphi x_n - \overline{\xi}_n| + |\overline{\xi}_n - \overline{\xi}_m| + |\overline{\xi}_m - \varphi x_m|,$$

which gives a $3\varepsilon$-proof of the fact that $\{x_n\}$ is a Cauchy
sequence. Let $\xi = \{x_n\}$. Then $\{\overline{\xi}_n\}$ converges to
$\overline{\xi}$, because given $\varepsilon$,

$$|\overline{\xi}_n - \overline{\xi}| \le |\overline{\xi}_n - \varphi x_n| + |\varphi x_n - \overline{\xi}| < 2\varepsilon$$

for $n$ sufficiently large. This proves that $\overline{E}$ is complete, and
concludes the proof for the existence of a completion of $E$.


Example 1. In integration theory, covered later in this book, one
starts with the vector space of continuous functions, say on $[0, 1]$, with
the $L^1$-norm

$$\|f\|_1 = \int_0^1 |f(t)|\,dt.$$

One can also take the vector space of continuous functions on $\mathbf{R}$,
vanishing outside some bounded interval, and define the $L^1$-norm similarly.
Then this space is not complete, and its completion is called $L^1$. It then
becomes a problem to identify elements of $L^1$ with certain functions, and
this is what we shall do.


Example 1 points to the need of a slight generalization of our normed
vector spaces. Indeed, even in elementary integration theory, one deals
with step functions, or piecewise continuous functions, which are such
that if $\|f\|_1 = 0$, then f may not be the zero function. For instance, if f
is 0 except at a finite number of points, then we do have $\|f\|_1 = 0$. In
view of this, one defines a seminorm on a vector space E to be a function
satisfying all properties of a norm, except that we require

$$|x| \ge 0$$

for all $x \in E$, but we allow $|x| = 0$ without having necessarily $x = 0$.
Then it is clear that the set of all $x \in E$ such that $|x| = 0$ is a subspace
$E_0$. The terminology of open and closed sets applies in the present
context, and the topology defined by a seminorm is simply not Hausdorff.
In fact, the closure of 0 is obviously the space $E_0$ itself.


In defining the completion, we can just as well define the completion
of a space with a seminorm. We form Cauchy sequences and null
sequences, and we still get a map

$$j\colon E \to \bar E,$$

the only difference being that j has a kernel, which the reader will verify
to be precisely $E_0$. In fact, we have a norm on the factor space $E/E_0$ if
we define the norm of a coset $|x + E_0|$ to be $|x|$ (independent of the coset
representative x since we have

$$|x + y| = |x|$$

for all $y \in E_0$). Thus we can say that if E has a seminorm, the completion
$\bar E$ is simply the completion of $E/E_0$ as discussed in this section.

A vector space E with a seminorm | | can be called a seminormed
space. We can define Cauchy sequences using the same definition as in
the normed case. We shall say that E is complete if every Cauchy
sequence in E converges, in other words, if given a Cauchy sequence
$\{x_n\}$ in E, there exists $x \in E$ such that given $\epsilon$, there exists N such that
for all $n \ge N$ we have

$$|x_n - x| < \epsilon.$$

Of course, the element x to which our sequence $\{x_n\}$ converges is not
uniquely determined, only up to an element of $E_0$. However, examples of
this situation arise in practice, in integration theory. One must then
distinguish between a complete seminormed space, and the completion of
$E/E_0$ mentioned above.


Example 2. Let E be the vector space of $C^\infty$ functions (say, real
valued) on R, vanishing outside a compact set (i.e. infinitely differentiable
functions f such that f(t) = 0 if t is outside some bounded interval). We
define the $H^0$-norm on E by

$$\|f\|_{H^0} = \langle f, f\rangle^{1/2},$$

where

$$\langle f, f\rangle = \int f(t)^2\,dt.$$

We define the $H^p$-norm by

$$\|f\|_{H^p}^2 = \sum_{k \le p} \|D^k f\|_{H^0}^2,$$

where D is the derivative. The completion of E under the $H^p$-norm is
called an $H^p$ space. This kind of space is used very frequently in analysis.
For p = 0, the norm is also called the $L^2$-norm.

Example 3. On the interval [0, 1], we let $C^p$ be the space of functions
having p continuous derivatives. For $f \in C^p$ we define

$$\|f\|_{C^p} = \sup_{k \le p} \|D^k f\|,$$

where $\|\ \|$ on the right is the sup norm. Then this is a norm. It is an
exercise to show that $C^p$ is already complete under this norm.


IV, §5. SPACES WITH OPERATORS 


Except for enumerating basic properties, it is rather rare in analysis that 
one meets merely a normed vector space, or a Banach space, just by 
itself. It is usually accompanied by a set of operators, and thus we make 
here some general comments on this situation. 

Let E be a normed vector space. Elements of L(E, E) are also called
operators on E. Let S be a set of operators on E. By an S-invariant
subspace F we mean a subspace such that for every $A \in S$ we have
$AF \subset F$, i.e. if $x \in F$ and $A \in S$, then $Ax \in F$. It is clear that if F is an
S-invariant subspace, then its closure is also S-invariant because if $x_n \in F$
and $x_n \to x$, then $Ax_n \to Ax$, so Ax lies in the closure of F.


An operator B is said to commute with S if AB = BA for all $A \in S$. If
B commutes with S, then both the kernel of B and its image are S-invariant
subspaces.


Proof. If $x \in E$ and Bx = 0, then BAx = ABx = 0 for all $A \in S$, so Ax
lies in the kernel of B, and the kernel is S-invariant. Similarly, the
relation A(Bx) = B(Ax) shows that A maps the image of B into itself, so
the image of B is S-invariant.


If A is an operator on E, and $c_0, \ldots, c_n$ are numbers, we may form the
operator

$$p(A) = c_n A^n + \cdots + c_0 I,$$

where

$$p(t) = c_n t^n + \cdots + c_0$$

is the polynomial having the numbers as coefficients. If p, q are polynomials
and pq denotes the ordinary product of polynomials, then we have

$$(p + q)(A) = p(A) + q(A) \quad\text{and}\quad (pq)(A) = p(A)q(A).$$

Indeed, if $q(t) = b_m t^m + \cdots + b_0$, then

$$p(t)q(t) = \sum_k d_k t^k,$$

where

$$d_k = \sum_{r+s=k} c_r b_s.$$

But

$$p(A)q(A) = \sum_k d_k A^k$$

since associativity, commutativity, and distributivity hold in multiplying
powers of A. The statement concerning the sum p + q is even more
trivial to see. Also, if c is a number, then

$$(cp)(A) = cp(A).$$

All these rules are useful when considering the evaluation of polynomials
on operators. In algebraic terminology, they express the fact that the
map

$$p \mapsto p(A)$$

is a ring-homomorphism from the ring of polynomials into the ring of
operators.
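The rule $(pq)(A) = p(A)q(A)$ can be checked numerically in the finite dimensional case. The following is only an illustrative sketch, under the assumption that E is $\mathbf{R}^2$ and operators are 2 x 2 matrices stored as lists of rows; the helper names (`poly_eval`, `poly_mul`, and so on) are introduced here for illustration and are not part of the text.

```python
# Polynomials are coefficient lists [c_0, c_1, ..., c_n].
# Operators are square matrices given as lists of rows (an assumption
# made for this sketch: E is finite dimensional).

def identity(n):
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]

def mat_add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def mat_scale(c, a):
    return [[c * x for x in row] for row in a]

def mat_mul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def poly_eval(coeffs, a):
    """Return p(A) = c_n A^n + ... + c_0 I, computed by Horner's rule."""
    n = len(a)
    result = mat_scale(0, identity(n))
    for c in reversed(coeffs):
        result = mat_add(mat_mul(result, a), mat_scale(c, identity(n)))
    return result

def poly_mul(p, q):
    """Coefficients d_k = sum over r + s = k of c_r b_s."""
    d = [0] * (len(p) + len(q) - 1)
    for r, c_r in enumerate(p):
        for s, b_s in enumerate(q):
            d[r + s] += c_r * b_s
    return d

def poly_add(p, q):
    m = max(len(p), len(q))
    p = p + [0] * (m - len(p))
    q = q + [0] * (m - len(q))
    return [x + y for x, y in zip(p, q)]

A = [[1, 2], [3, 4]]
p = [1, 0, 2]    # p(t) = 2t^2 + 1
q = [-1, 3]      # q(t) = 3t - 1
product_side = poly_eval(poly_mul(p, q), A)
factor_side = mat_mul(poly_eval(p, A), poly_eval(q, A))
```

With integer entries the two sides agree exactly, and the same comparison with `poly_add` illustrates the additive rule.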

If F is an A-invariant subspace, then it is clear that F is also p(A)-invariant
for all polynomials p. Thus if F is in fact a subspace of E
which is invariant for an operator A, then it is also invariant for the set
of all polynomials in A, called also the ring of operators generated by A.
The same holds for any set of operators S, letting the ring of operators
generated by S be the set of all operators expressed as finite sums

$$\sum c_{i_1 \cdots i_r} A_1^{i_1} \cdots A_r^{i_r},$$

where $A_1, \ldots, A_r$ are elements of S, and the coefficients are numbers.
Indeed, if F is A- and B-invariant, then it is also (A + B)-invariant and
AB-invariant.

If an operator B commutes with all elements of S, then it is clear that
B also commutes with all elements in the ring of operators generated
by S, because if B commutes with $A_1$ and $A_2$, then B commutes with
$A_1 + A_2$ and also with $A_1 A_2$. Furthermore, if F is a closed subspace
and is S-invariant, then it is also $\bar S$-invariant, where $\bar S$ is the closure of
S. Indeed, if $\{B_n\}$ is a sequence of operators in S converging to some
operator B, and if $x \in F$, then the sequence $\{B_n x\}$ lies in F and
converges to Bx, which lies in F because F is closed.

In Chapters XVII and XVIII we study a pair (E, A) consisting of a 
space E and an operator A, and analyze this pair, describing its structure 
completely in important cases. The idea is to apply in the present con- 
text an all-pervasive point of view in mathematics, which is to decompose 
an object into a direct sum of simpler objects. In the present context, let
us make some general definitions.
Let E be a Banach space, and F, G closed subspaces. We know that
the product F x G consisting of all pairs (y, z) with $y \in F$ and $z \in G$ is
also a Banach space, say under the sup norm. If the map

$$F \times G \to E$$

given by

$$(y, z) \mapsto y + z$$

is a toplinear isomorphism, then we say that E is the direct sum of the
subspaces F and G. Observe that our requirements involve both an
algebraic and a topological condition. It follows from our conditions
that

$$E = F + G \quad\text{and}\quad F \cap G = \{0\}.$$

It will be proved later that, in fact, these two conditions are sufficient; in
other words, if they are satisfied, then the map

$$(y, z) \mapsto y + z$$

not only has an algebraic inverse, but this inverse is continuous (corollary
of the open mapping theorem). When E is a direct sum of F and G,
we write

$$E = F \oplus G.$$

If A is an operator on E, then we are interested in expressing E as a
direct sum of A-invariant subspaces. Subsequent chapters give examples
of this situation.


APPENDIX: CONVEX SETS 


APP., §1. THE KREIN-MILMAN THEOREM


Although we shall not use the theorem of this section later in the book 
(except for some exercises), it is worthwhile giving it since it is used 
at the beginning of more advanced and specialized courses, in a wide 
variety of contexts. The exposition follows that of Artin (cf. Collected 
Works). 


Throughout this section, we let E be a vector space over the reals (not
normed). We let E* be a vector space of linear maps of E into R (not
necessarily the space of all such linear maps), and assume that E* separates
E, that is given $x \in E$, $x \ne 0$ there exists $\lambda \in E^*$ such that $\lambda(x) \ne 0$. We
give E the coarsest topology (the one with the fewest open sets) making all
$\lambda \in E^*$ continuous. A base for this topology is therefore given by the
following sets: We take $x \in E$, and $\lambda_1, \ldots, \lambda_n \in E^*$, and $\epsilon > 0$. We let B
be the set of all $y \in E$ such that

$$|\lambda_i(y) - \lambda_i(x)| < \epsilon \quad (i = 1, \ldots, n).$$

The set of all such B is a base for the E*-topology.
A subset S of E is said to be convex if given $x, y \in S$, the line segment

$$(1 - t)x + ty, \quad 0 \le t \le 1,$$

joining x to y is contained in S.


We observe that an arbitrary intersection of convex sets is convex. 


Lemma 1.1. Let $x_1, \ldots, x_n \in S$. Any convex set containing $x_1, \ldots, x_n$
also contains all linear combinations

$$t_1 x_1 + \cdots + t_n x_n$$

with $0 \le t_i \le 1$ for all i, and $t_1 + \cdots + t_n = 1$. Conversely, the set of all
such linear combinations is convex.


Proof. If $t_n \ne 1$, then the above linear combination is equal to

$$(1 - t_n)\left(\frac{t_1}{1 - t_n}\,x_1 + \cdots + \frac{t_{n-1}}{1 - t_n}\,x_{n-1}\right) + t_n x_n.$$

The first assertion follows at once by induction. The converse is also an
immediate consequence of the definitions.


The following properties of convex sets also follow at once from the 
definitions. 


Let $\lambda\colon E \to F$ be a linear map. If S is convex in E, then $\lambda(S)$ is convex
in F. If T is convex in F, then $\lambda^{-1}(T)$ is convex in E. In other words,
the image and inverse image of a convex set under a linear map are
convex.


Let $\lambda \in E^*$, $\lambda \ne 0$, and let $H_0$ be the kernel of $\lambda$ (i.e. the set of all $x \in E$
such that $\lambda(x) = 0$). Then $H_0$ is a closed subspace, and if $v \in E$ is such
that $\lambda(v) \ne 0$, then

$$E = H_0 + \mathbf{R}v.$$

If $\lambda_1$, $\lambda_2$ are non-zero functionals with the same kernel H, then there
exists $c \in \mathbf{R}$, $c \ne 0$ such that $\lambda_1 = c\lambda_2$. Indeed, taking any v not in H,
one sees at once that $c = \lambda_1(v)/\lambda_2(v)$.

Let $\lambda \ne 0$ be an element of E*, and let $c \in \mathbf{R}$. By the hyperplane $H_c$ we
mean the set of all $x \in E$ such that $\lambda(x) = c$. In other words, $H_c = \lambda^{-1}(c)$.
If $H_0$ is the kernel of $\lambda$, then $H_c$ consists of all elements $y + y_0$ with
$y \in H_0$ and $y_0$ any fixed element of E such that $\lambda(y_0) = c$.

The set of $x \in E$ such that $\lambda(x) \ge c$ will be called a closed half space
determined by the hyperplane, and so will the set of all x such that
$\lambda(x) \le c$. Similarly, we have the open half spaces, determined by the
inequalities $\lambda(x) > c$ and $\lambda(x) < c$ respectively.

If S is a closed subset of E and $x_0$ a point, we say that a hyperplane
H separates S and $x_0$ if S is contained in one of the closed half spaces
determined by H, and $x_0$ is not contained in this half space.


Theorem 1.2. Let S be a closed convex set in E, and let $x_0 \notin S$. Then
there exists a separating hyperplane H for S and $x_0$, such that S is
contained in a closed half space determined by H.


Proof. We begin by proving our statement in the finite dimensional 
case. 

Let T be a closed convex subset of $\mathbf{R}^n$, and let P be a point of $\mathbf{R}^n$
such that $P \notin T$. The function $f(X) = |X - P|$ (euclidean norm) has a
minimum on T, say at $Q \in T$. Let $N = Q - P$. Since $P \notin T$, we have
$N \ne O$. We contend that the hyperplane passing through Q, perpendicular
to N, will satisfy our requirements. The equation of this hyperplane
is $X \cdot N = Q \cdot N$. Let Q' be any point of T, and $Q' \ne Q$. For every t with
$0 < t \le 1$, we have

$$|Q - P| \le |Q + t(Q' - Q) - P| = |(Q - P) + t(Q' - Q)|.$$

Squaring gives

$$(Q - P)^2 \le (Q - P)^2 + 2t(Q - P)\cdot(Q' - Q) + t^2(Q' - Q)^2.$$

Canceling and dividing by t, we obtain

$$0 \le 2(Q - P)\cdot(Q' - Q) + t(Q' - Q)^2.$$

Letting t tend to 0 yields

$$Q'\cdot N \ge Q\cdot N = P\cdot N + N\cdot N.$$

This proves that T is contained in the closed half space defined by

$$X\cdot N \ge c,$$

where $c = P\cdot N + N\cdot N$. Since $N\cdot N > 0$, we have $P\cdot N < c$, so P does not
lie in this half space. This proves our contention, and the fact that
our hyperplane separates T and P.
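The construction in the finite dimensional case is easy to carry out when the nearest point Q can be computed explicitly. The sketch below assumes, purely for illustration, that T is the unit cube $[0, 1]^n$, for which the nearest point to P is obtained by clipping each coordinate; the function names are hypothetical.

```python
from itertools import product

def project_onto_cube(p, lo=0.0, hi=1.0):
    """Nearest point Q of the cube [lo, hi]^n to P (coordinatewise clipping)."""
    return [min(max(x, lo), hi) for x in p]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def separating_hyperplane(p):
    """Return (N, c) with N = Q - P and c = Q.N, so that T lies in X.N >= c."""
    q = project_onto_cube(p)
    n = [qi - pi for qi, pi in zip(q, p)]
    if all(x == 0 for x in n):
        raise ValueError("P lies in T, so there is nothing to separate")
    return n, dot(q, n)

P = [2.0, 0.5, -1.0]
N, c = separating_hyperplane(P)

# A linear function on the cube attains its minimum at a vertex,
# so it suffices to test the inequality X.N >= c on the vertices.
vertices = [list(v) for v in product([0.0, 1.0], repeat=len(P))]
cube_in_half_space = all(dot(v, N) >= c for v in vertices)
point_outside = dot(P, N) < c
```

Here Q = (1, 0.5, 0), N = (-1, 0, 1), and c = -1, while $P\cdot N = -3$, in agreement with $c = P\cdot N + N\cdot N$.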

We return to the general case of the space E. There exists a neighborhood
of $x_0$ which does not intersect S. In other words, there exist $\epsilon > 0$ and
$\lambda_1, \ldots, \lambda_n \in E^*$ such that all $y \in E$ satisfying

$$|\lambda_i(y) - \lambda_i(x_0)| < \epsilon \quad (i = 1, \ldots, n)$$

do not lie in S. Consider the linear map

$$\varphi\colon E \to \mathbf{R}^n$$

given by

$$x \mapsto (\lambda_1(x), \ldots, \lambda_n(x)).$$

The image of S is a convex set $\varphi(S)$ in $\mathbf{R}^n$, which does not intersect the
neighborhood of $\varphi(x_0)$ determined by the inequality

$$\|Q - \varphi(x_0)\| < \epsilon \quad \text{(sup norm)}.$$

Its closure does not contain $\varphi(x_0)$. By our result in the finite dimensional
case, there exists a non-zero vector

$$N = (c_1, \ldots, c_n) \in \mathbf{R}^n$$

such that $\varphi(S)$ lies in the closed half space determined by N and a
suitable constant c, and $\varphi(x_0)$ does not. We let

$$\lambda = c_1\lambda_1 + \cdots + c_n\lambda_n.$$

Then $\lambda \in E^*$ and S is contained in a closed half space $\lambda \ge c$, which does
not contain $x_0$, thus proving Theorem 1.2.


Remark. All that we need in the sequel is that, the assumptions being
as in the theorem, there exists a functional $\lambda \in E^*$ such that $\lambda(x_0)$ is not
contained in $\lambda(S)$.


We define an extreme point of a convex set S to be a point $x \in S$
having the following property: Whenever $y_1$, $y_2$ are points of S such
that we can write

$$x = ty_1 + (1 - t)y_2$$

with 0 < t < 1, then $y_1 = y_2$.


Theorem 1.3. Let S be a non-empty, convex, compact subset of E. Then
there exists an extreme point of S.

Proof. Let $\mathcal{F}$ be the family of non-empty, convex, compact subsets of
E contained in S, and having the following additional property:


If $K \in \mathcal{F}$ and $x \in K$, and if $y_1, y_2 \in S$ are such that

$$x = ty_1 + (1 - t)y_2$$

with 0 < t < 1, then $y_1, y_2 \in K$.


Then the set S itself is in $\mathcal{F}$. We can order elements of $\mathcal{F}$ by
descending inclusion, and if $\{K_i\}_{i \in I}$ is a totally ordered subfamily, then
the intersection

$$\bigcap_{i \in I} K_i$$

is not empty, and clearly is again in $\mathcal{F}$. Hence by Zorn's lemma, there
exists a minimal element $S_0$ in $\mathcal{F}$. We contend that $S_0$ consists of one
point. (This will prove our theorem.) Since elements of E* separate
points, it will suffice to prove that for each $\lambda \in E^*$, the set $\lambda(S_0)$ consists
of one point. But $\lambda(S_0)$ is convex and compact, whence a closed bounded
interval. Let c be a right end point of this interval. Then the set
$\lambda^{-1}(c) \cap S_0$ is non-empty, convex, compact. We contend that it lies in $\mathcal{F}$.
Let x be an element in $\lambda^{-1}(c) \cap S_0$, and suppose that we can write

$$x = ty_1 + (1 - t)y_2$$

with $y_1, y_2 \in S$ and 0 < t < 1. Since $S_0 \in \mathcal{F}$, we get $y_1, y_2 \in S_0$. Applying
$\lambda$, we find that

$$\lambda(x) = c = t\lambda(y_1) + (1 - t)\lambda(y_2).$$

Since c is an end point of the interval $\lambda(S_0)$, it follows that

$$\lambda(y_1) = \lambda(y_2) = c.$$

Hence $y_1$, $y_2$ also lie in $\lambda^{-1}(c)$, and this shows that $\lambda^{-1}(c) \cap S_0$ is in $\mathcal{F}$.
Since we took $S_0$ minimal, we conclude that $S_0$ is contained in $\lambda^{-1}(c)$,
thereby proving our theorem.


Corollary 1.4. Let S be as in Theorem 1.3, and let $\lambda \in E^*$. Let c be an
end point of the interval $\lambda(S)$. Then $\lambda^{-1}(c) \cap S$ contains an extreme
point of S.


Proof. The intersection of the hyperplane $\lambda^{-1}(c)$ with S is non-empty,
convex, compact, and thus has an extreme point x, with respect to
$\lambda^{-1}(c) \cap S$. However, if $y_1, y_2 \in S$ and

$$x = ty_1 + (1 - t)y_2$$

with 0 < t < 1, then $\lambda(x) = c = t\lambda(y_1) + (1 - t)\lambda(y_2)$, and hence
$\lambda(y_1) = \lambda(y_2) = c$, so that $y_1, y_2 \in \lambda^{-1}(c) \cap S$. From this we conclude that
$y_1 = y_2$, and hence that x is also an extreme point of S itself.


Theorem 1.5 (Krein-Milman Theorem). Let K be a convex, compact
subset of E. Let S be the set of extreme points of K. Then K is the
smallest closed convex set containing all elements of S (i.e. the intersection
of all closed convex sets containing S).


Proof. Let S' be the intersection of all closed convex sets containing S.
Then $S' \subset K$, and since K is compact, it follows that S' is compact.
Suppose that there exists $x_0 \in K$ but $x_0 \notin S'$. By Theorem 1.2, there exists
$\lambda \in E^*$ such that $\lambda(x_0)$ is not contained in the interval $\lambda(S')$, say

$$\lambda(S') < \lambda(x_0).$$

Let c be the right end point of the interval $\lambda(K)$. By Corollary 1.4, the
set $\lambda^{-1}(c) \cap K$ contains an extreme point of K, contradicting the fact that
$\lambda(S) < c$, and proving our theorem.
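A finite illustration of Corollary 1.4 may help fix ideas. The sketch below assumes that K is the convex hull of finitely many points in the plane (a rectangle with one interior point added) and that the functional is $\lambda(x, y) = 3x - y$; these choices, and the helper names, are made up for the example. It samples convex combinations on a grid and checks that the right end point of $\lambda(K)$ is attained only at a corner.

```python
from fractions import Fraction
from itertools import product

def functional(coeffs):
    """The linear functional x -> sum of coeffs[k] * x[k]."""
    return lambda x: sum(c * xi for c, xi in zip(coeffs, x))

def convex_combination(weights, points):
    dim = len(points[0])
    return tuple(sum(t * p[k] for t, p in zip(weights, points))
                 for k in range(dim))

def simplex_grid(n, steps):
    """Weights (t_1, ..., t_n) with t_i = k_i/steps, t_i >= 0, sum equal to 1."""
    for ks in product(range(steps + 1), repeat=n):
        if sum(ks) == steps:
            yield tuple(Fraction(k, steps) for k in ks)

# Corners of a rectangle together with one interior point.
points = [(0, 0), (2, 0), (2, 1), (0, 1), (1, Fraction(1, 2))]
lam = functional((3, -1))

c = max(lam(p) for p in points)      # right end point of lam(K)
samples = [convex_combination(w, points)
           for w in simplex_grid(len(points), 4)]
maximizers = {x for x in samples if lam(x) == c}
```

Exact rational arithmetic is used so that the comparison `lam(x) == c` is reliable; the only sampled maximizer is the corner (2, 0), which is an extreme point of K.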


APP., §2. MAZUR’S THEOREM 


In the applications of Theorem 1.2, one starts frequently with a convex 
set in a Banach space, closed in the norm topology (i.e. the topology 
defined by the norm). In Theorem 1.2, we needed a convex set closed for 
the weak topology defined by a family of functionals. An example of 
such a family is simply the totality of all functionals, continuous for the 
norm topology. Of course, if a set S is compact for the norm topology, 
it is also compact for the weak topology. One can then raise the ques- 
tion whether a closed convex set for the norm topology is also closed for 
the weak topology. The answer is yes: 


Theorem 2.1 (Mazur’s Theorem). Let E be a Banach space and let A 
be a convex subset, closed for the norm topology. Then A is also closed 
for the weak topology (that topology having the smallest amount of open 
sets making all functionals continuous). In fact, A is the intersection of 
all closed half spaces containing A. 


The proof is self contained, and is based on the following lemma.

Lemma 2.2. Let U be an open non-empty convex set in E which does
not contain the origin. Then there exists a functional $\lambda$ on E whose
kernel does not intersect U.


Proof. Let $a \in U$. Then $-a \notin U$, otherwise $0 \in U$ because U is convex,
and this is impossible. By a cone we shall mean a subset C of E such
that if $x \in C$, then $tx \in C$ for all real $t \ge 0$. Let $\Gamma$ be the set of all convex
cones containing U but not $-a$. Then $\Gamma$ is not empty because the set
of all points tx with $t \ge 0$ and $x \in U$ is verified to be a convex cone
directly from the definitions, and belongs to $\Gamma$. It is clear that $\Gamma$ is
inductively ordered by ascending inclusion. Let C be a maximal element
of $\Gamma$. We contend that $C \cap (-C)$ is a closed hyperplane H which does
not intersect U.


First we prove that the maximal cone C is closed. Suppose C is not
closed. Then we must have $-a \in \bar C$, for otherwise we have $C \subset \bar C \in \Gamma$
and $C \ne \bar C$, contradicting the maximality of C. On the other hand, we
have $a \in U \subset C$. Since U is open, there is a ball $B \subset U$ centered at a
and of radius r > 0. But $\bar C$ is convex. Therefore $\bar C$ contains the set A of
elements $(-a + x)/2$ with $x \in B$. It is easy to see that A contains the ball
centered at the origin and of radius r/2. This and the fact that $\bar C$ is a
cone imply that $\bar C = E$. Since C is convex and contains the open set U,
the sets C and $\bar C$ have the same interior, so C = E, a contradiction. It
follows that $H = C \cap (-C)$ is closed, is a cone, is convex, and $H = -H$.
Therefore H is immediately seen to be a closed subspace. We have
$H \ne E$ because $-a \notin C$, so $-a \notin H$.

We have $E = C \cup (-C)$. To see this, let $x \in E$ and suppose $x \notin C$,
$x \notin -C$. Since C is maximal, the cone consisting of all elements c + tx
with $c \in C$, $t \ge 0$ contains $-a$, and so does the cone of all elements
$c + t(-x)$, $c \in C$, $t \ge 0$. Hence we can write

$$-a = c_1 + t_1 x = c_2 - t_2 x$$

with $c_1, c_2 \in C$ and $t_1, t_2 \ge 0$. Consequently

$$c_1 + (t_1 + t_2)x = c_2 \in C.$$

However, $c_1 + t_1 x = -a$ is on the line segment between $c_1$ and
$c_1 + (t_1 + t_2)x$, and thus lies in C, a contradiction which proves that
$E = C \cup (-C)$.

Now suppose that $x \in C$. Then the line segment between x and $-a$
contains a point of H. For instance, on the segment $x + t(-a - x)$ with
$0 \le t \le 1$, let $\tau$ be the sup of all t such that $x + t(-a - x)$ lies in C.
Then $x + \tau(-a - x)$ lies in H, and $\tau \ne 1$, otherwise $-a \in H$, which is
impossible. We therefore have

$$(1 - \tau)x - \tau a = h \in H,$$

whence

$$x = \frac{h + \tau a}{1 - \tau}.$$

Working also with $-x$ instead of x, we conclude that E is generated by
H and a, so that the factor space E/H has dimension 1 and hence H is a
closed hyperplane.

Finally, H does not intersect U, for otherwise let $h \in H \cap U$. Since U
is open, for small s > 0 we have $h - sa \in U$ so $h - sa \in C$. But $-h \in C$,
whence $-sa \in C$ and $-a \in C$, which is impossible. This proves our
lemma.


We now prove: Let b be a point of a Banach space E, which does not
belong to the norm-closed non-empty convex set A. Then there exists a
functional $\lambda$ and a number $\alpha$ such that $\lambda(x) > \alpha$ for all $x \in A$ and
$\lambda(b) < \alpha$.


Proof. Let B be an open ball centered at b and not intersecting A.
Then the set U = A - B, consisting of all points x - y with $x \in A$ and
$y \in B$, is open, convex, non-empty, and does not contain the origin. (U
is open because it is a union of open sets a - B with $a \in A$, and it is
immediately verified to be convex because the sum of two convex sets is
convex.) We apply our lemma to U and find a functional $\lambda$ as in the
lemma. Replacing $\lambda$ by $-\lambda$ if necessary, we have $\lambda z \ge 0$ for all
$z \in A - B$, and therefore $\lambda x \ge \lambda y$ for all $x \in A$, $y \in B$. Let
$\beta = \inf \lambda x$ for $x \in A$. The map $\lambda$ is an open mapping,
for instance because $\lambda$ gives an isomorphism of a one-dimensional
subspace of E onto R. Therefore $\lambda y < \beta$ for all $y \in B$, so that in
particular, $\lambda b < \beta$. We let $\alpha = \frac{1}{2}(\lambda b + \beta)$ to conclude the proof of our
assertion.


Mazur's theorem follows at once, since we have proved that a non-empty
closed convex set is the intersection of all closed half spaces
containing it.

IV, §6. EXERCISES


1. Fill in the details that if F is complete, then L(E, F) is complete. 


2. Show that the Hahn-Banach theorem for the complex case follows easily
from the real case of this theorem. In other words, finish the details of the
argument given after Corollary 1.2.


3. Let E, F, G be normed vector spaces. A bilinear map $\lambda\colon E \times F \to G$ is a map
which is linear in each variable, i.e. for each $x \in E$ the map $y \mapsto \lambda(x, y)$ is
linear, and for each $y \in F$, the map $x \mapsto \lambda(x, y)$ is linear. Show that a bilinear
map $\lambda$ is continuous if and only if there exists C > 0 such that

$$|\lambda(x, y)| \le C|x||y|$$

for all $x \in E$, $y \in F$. Let L(E, F; G) be the set of continuous bilinear maps of
E x F into G. Show that L(E, F; G) is a normed vector space, if the norm of
$\lambda$ is defined to be the inf of all numbers C as above. Show that if G is
complete, then L(E, F; G) is complete.


4. Let E be a Banach space and F a closed subspace. For each coset x + F of 
F, define |x + F| = inf|x + y| for ye F. Show that this defines a norm on the 
factor space E/F, and that the natural map E — E/F is continuous linear. (Cf. 
Chapter XV, §1.) 


5. Let A be a Banach algebra. Suppose that there is a unit element $e \ne 0$, but
that we do not necessarily have |e| = 1.
(a) Show that $|e| \ge 1$.
(b) Define a new norm || || on A by putting

$$\|x\| = \sup_{y \ne 0} \frac{|xy|}{|y|}.$$

Show that || || is in fact a norm and that ||e|| = 1.
(c) Show that A is a Banach algebra under this new norm.


6. (a) Show that a finite dimensional subspace of a normed vector space is
closed.
(b) Let E be a Banach space and F a finite dimensional subspace. Show that
there exists a closed subspace G such that F + G = E and

$$F \cap G = \{0\}.$$

You will have to use the Hahn-Banach theorem.


7. Let F be a closed subspace of a normed vector space E, and let $v \in E$,
$v \notin F$. Show that F + Rv is closed. If E = F + Rv, show that E is the direct
sum of F and Rv. (You can give a simple ad hoc proof for this. A more
general result will be proved later as a consequence of the open mapping
theorem.)


8. Let E, F, G be normed vector spaces and assume that G is complete. Let
$\lambda\colon E \times F \to G$ be a continuous bilinear map. Show that $\lambda$ can be extended to
a continuous bilinear map of the completions $\bar E \times \bar F \to G$, which has the same
norm as $\lambda$. (Identify E, F as subspaces of their completions.)


9. Let A be a Banach algebra, commutative, and with unit element. Let J be an
ideal. Show that the closure of J is also an ideal. (The definition of an ideal
is the same as in the case of rings of continuous functions. If the algebra is
not commutative, then the same result is valid if we replace ideals by left
ideals.)


10. Let A be a commutative Banach algebra and let M be a maximal ideal.
Show that M is closed.


11. Give the proof of the inequality left to the reader in Proposition 1.3.


12. Let E be an infinite dimensional Banach space, and let $\{x_n\}$ be a sequence of
linearly independent elements of norm 1. Show that there exists an element
in the closure of the space generated by all $x_n$ which does not lie in any
subspace generated by a finite number of $x_n$. [Hint: Construct this element
as an absolutely convergent sum $\sum c_n x_n$.]


13. Let $\{E_n\}$ be a sequence of Banach spaces. Let E be the set of all sequences
$\xi = \{x_n\}$ with $x_n \in E_n$ such that $\sum |x_n|$ converges. Show that E is a vector
space, and that if we define

$$|\xi| = \sum |x_n|,$$

then this is a norm, and E is complete.


14. Let E be a Banach space, and P, Q two operators on E such that P + Q = I,
and PQ = QP = O. Show that

$$E = \operatorname{Ker} P + \operatorname{Ker} Q,$$

and that Ker P = Im Q. Show that $\operatorname{Ker} P \cap \operatorname{Ker} Q = \{O\}$, and that Ker P
and Im P are closed subspaces.


15. Let E be a Banach space and let F be a vector subspace. Let $\bar F$ be the
closure of F. Prove that $\bar F$ is a subspace, and is complete.


16. Let A be a subset of a Banach space. By c(A) we denote the convex closure
of A, i.e. the intersection of all convex sets containing A. We let $\bar c(A)$ denote
the closure of c(A). Then $\bar c(A)$ is convex. Prove: If K is compact, then $\bar c(K)$
is also compact. [Hint: Show c(K) is totally bounded as follows. First find a
finite number of points $x_1, \ldots, x_n$ such that K is contained in the union of
the balls of radius $\epsilon$ around these points. Let C be the convex closure of
the set $\{x_1, \ldots, x_n\}$. Show that C is compact, expressing C as a continuous
image of a compact set. Let $y_1, \ldots, y_m$ be points of C such that C is
contained in the union of balls of radius $\epsilon$ around these points. Then get the
desired result.]


17. Let F be the complete normed vector space of continuous periodic functions
on $[-\pi, \pi]$ of period $2\pi$, with the sup norm. Let E be the vector space of all
real sequences $\alpha = \{a_n\}$ such that $\sum |a_n|$ converges. Define

$$|\alpha| = \sum |a_n|.$$

Show that this is a norm on E. Let

$$L\alpha(x) = \sum a_n \cos nx,$$

so that $L\colon E \to F$ is a linear map. Show that L has norm 1. Let B be the closed
unit ball of radius 1 centered at the origin in E. Show that L(B) is closed in
F. [Hint: Let $\{f_k\}$ (k = 1, 2, ...) be a sequence of elements in L(B) which
converges uniformly to a function f in F. Let $b_n$ be the Fourier coefficient
of f with respect to cos nx. Let $\beta = \{b_n\}$. Show that $\beta$ is in E and that
$L(\beta) = f$.]


18. Let K be a continuous function of two variables defined for (x, y) in the
square [a, b] x [a, b]. Assume that $\|K\| \le C$ for some constant C > 0, where
|| || is the sup norm. Let E be the Banach space of continuous functions on
[a, b], and let $T\colon E \to E$ be the linear map such that

$$Tg(x) = \int_a^b K(t, x)g(t)\,dt.$$

Show that T is bounded and $\|T\| \le C(b - a)$. For more on T, see Chapter
XIV, Exercise 5.


19. Let A be a commutative Banach algebra with unit element e, over the reals,
and define the exponential and logarithm maps by

$$\exp u = 1 + u + \frac{u^2}{2!} + \cdots$$

and

$$\log u = (u - e) - \frac{(u - e)^2}{2} + \frac{(u - e)^3}{3} - \cdots.$$

Show that exp converges absolutely for all $u \in A$, and that log converges
absolutely for all u with |u - e| < 1. Show that exp and log give inverse
continuous mappings from a neighborhood of 0 onto a neighborhood of e in
A. Show that they satisfy the usual functional equations

$$\exp(u + v) = (\exp u)(\exp v),$$
$$\log(uv) = \log u + \log v,$$

in these domains of definition. Show that every element of A sufficiently close
to e is an n-th power for every positive integer n.


20. Let X be a compact Hausdorff space and let C(X) be the Banach space of
real continuous functions on X. If $\lambda$ is a functional on C(X) (sup norm) such
that $\lambda(1) = |\lambda|$, show that $\lambda$ is positive, in the sense that if $f \in C(X)$, $f \ge 0$,
then $\lambda(f) \ge 0$.


CHAPTER V 


Hilbert Space 


V, §1. HERMITIAN FORMS 


Essentially all of this chapter goes through over the real or the complex
numbers with no change. Since the theory over the complex numbers does
introduce the extra conjugation, we use the complex language, and point out
explicitly in one or two instances those results which are valid only over
the complex numbers.

Let E, F be vector spaces over C and let $L\colon E \to F$ be a map. We say
that L is antilinear, or semi-linear, if L is R-linear, and $L(\alpha x) = \bar\alpha L(x)$ for
all $x \in E$ and $\alpha \in \mathbf{C}$.

Let E be a vector space over the complex numbers. A sesquilinear
form or scalar product on E is a map

$$E \times E \to \mathbf{C}$$

denoted by

$$(x, y) \mapsto \langle x, y\rangle$$

which is linear in its first variable, and semi-linear or antilinear in its
second variable, meaning that for $x, y, y_1, y_2 \in E$, $\alpha \in \mathbf{C}$, we have

$$\langle x, y_1 + y_2\rangle = \langle x, y_1\rangle + \langle x, y_2\rangle \quad\text{and}\quad \langle x, \alpha y\rangle = \bar\alpha\langle x, y\rangle.$$

If in addition we have for all $x, y \in E$

$$\langle x, y\rangle = \overline{\langle y, x\rangle},$$

we say that the form is hermitian. If furthermore we have $\langle x, x\rangle \ge 0$ for
all $x \in E$, we say that the form is positive. We say the form is positive
definite if it is positive, and $\langle x, x\rangle > 0$ if $x \ne 0$. We shall assume throughout
that our form $\langle\ ,\ \rangle$ is positive, but not necessarily definite. We observe
that a sesquilinear form is always R-bilinear.

We define v to be perpendicular or orthogonal to w if $\langle v, w\rangle = 0$. Let
S be a subset of E. The set of elements $v \in E$ such that $\langle v, w\rangle = 0$ for
all $w \in S$ is a subspace of E. This is easily seen and will be left as an
exercise. We denote this set by $S^\perp$. Let $E_0$ consist of all elements $v \in E$
such that $v \in E^\perp$, that is $\langle v, w\rangle = 0$ for all $w \in E$. Then $E_0$ is a subspace,
which will be called the null space of the hermitian product.


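Theorem 1.1. If $w \in E$ is such that $\langle w, w\rangle = 0$, then $w \in E_0$, that is
$\langle w, v\rangle = 0$ for all $v \in E$.


Proof. Let t be real, and consider

$$0 \le \langle v + tw, v + tw\rangle = \langle v, v\rangle + 2t\,\mathrm{Re}\langle v, w\rangle + t^2\langle w, w\rangle = \langle v, v\rangle + 2t\,\mathrm{Re}\langle v, w\rangle.$$

If $\mathrm{Re}\langle v, w\rangle \ne 0$ then we take t very large of opposite sign to $\mathrm{Re}\langle v, w\rangle$.
Then $\langle v, v\rangle + 2t\,\mathrm{Re}\langle v, w\rangle$ is negative, a contradiction. Hence

$$\mathrm{Re}\langle v, w\rangle = 0.$$

This is true for all $v \in E$. Hence $\mathrm{Re}\langle iv, w\rangle = 0$ for all $v \in E$, whence
$\mathrm{Im}\langle v, w\rangle = 0$. Hence $\langle v, w\rangle = 0$, as was to be shown.


We define $|v| = \sqrt{\langle v, v\rangle}$, and call it the length or norm of v. By
definition and Theorem 1.1, we have |v| = 0 if and only if $v \in E_0$.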


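Theorem 1.2 (Schwarz Inequality). For all $v, w \in E$ we have

$$|\langle v, w\rangle| \le |v||w|.$$

Proof. Let $\alpha = \langle w, w\rangle$ and $\beta = -\langle v, w\rangle$. We have

$$0 \le \langle \alpha v + \beta w, \alpha v + \beta w\rangle$$
$$= \langle \alpha v, \alpha v\rangle + \langle \beta w, \alpha v\rangle + \langle \alpha v, \beta w\rangle + \langle \beta w, \beta w\rangle$$
$$= \alpha\bar\alpha\langle v, v\rangle + \beta\bar\alpha\langle w, v\rangle + \alpha\bar\beta\langle v, w\rangle + \beta\bar\beta\langle w, w\rangle.$$

Note that $\alpha = |w|^2$. Substituting the values for $\alpha$, $\beta$, we obtain

$$0 \le |w|^4|v|^2 - 2|w|^2\langle v, w\rangle\overline{\langle v, w\rangle} + |w|^2\langle v, w\rangle\overline{\langle v, w\rangle}.$$

But

$$\langle v, w\rangle\overline{\langle v, w\rangle} = |\langle v, w\rangle|^2.$$

Hence

$$|w|^2|\langle v, w\rangle|^2 \le |w|^4|v|^2.$$

If |w| = 0, then $w \in E_0$ by Theorem 1.1 and the Schwarz inequality is
obvious. If $|w| \ne 0$, then we can divide this last relation by $|w|^2$, and
taking the square roots yields the proof of the theorem.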
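Since the form is only assumed positive, not definite, it may help to see Theorem 1.1 and the Schwarz inequality at work on a degenerate example. The sketch below assumes the form $\langle x, y\rangle = x_1\bar y_1$ on $\mathbf{C}^2$, which is hermitian and positive but vanishes on every vector of the shape (0, z); the names `form` and `length` are chosen for this illustration only.

```python
def form(x, y):
    """<x, y> = x_1 * conj(y_1): linear in x, antilinear in y, positive,
    but not definite, since it ignores the second coordinate."""
    return complex(x[0]) * complex(y[0]).conjugate()

def length(x):
    """|x| = sqrt(<x, x>); here <x, x> is real and non-negative."""
    return form(x, x).real ** 0.5

w = (0, 1)                                   # <w, w> = 0 although w != 0
samples = [(1 + 2j, 3), (0, 5j), (-2, 1 - 1j)]

# Theorem 1.1: a vector of length 0 is perpendicular to everything.
w_in_null_space = all(form(w, v) == 0 and form(v, w) == 0 for v in samples)

# Schwarz inequality on all pairs of sample vectors.
schwarz_holds = all(abs(form(u, v)) <= length(u) * length(v) + 1e-12
                    for u in samples for v in samples)
```

The small tolerance in the Schwarz check only absorbs floating point rounding; for this particular form the two sides are in fact equal.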

Theorem 1.3. The function $v \mapsto |v|$ is a seminorm on E, that is:

We have $|v| \ge 0$, and |v| = 0 if and only if $v \in E_0$.

For every complex $\alpha$, we have $|\alpha v| = |\alpha||v|$.

For $v, w \in E$ we have $|v + w| \le |v| + |w|$.

Proof. The first assertion follows from Theorem 1.1. The second is
left to the reader. The third is proved with the Schwarz inequality. It
suffices to prove that

$$|v + w|^2 \le (|v| + |w|)^2.$$

To do this, we have

$$|v + w|^2 = \langle v + w, v + w\rangle = \langle v, v\rangle + \langle w, v\rangle + \langle v, w\rangle + \langle w, w\rangle.$$

But $\langle w, v\rangle + \langle v, w\rangle = 2\,\mathrm{Re}\langle v, w\rangle \le 2|\langle v, w\rangle|$. Hence by Schwarz,

$$|v + w|^2 \le |v|^2 + 2|\langle v, w\rangle| + |w|^2 \le |v|^2 + 2|v||w| + |w|^2 = (|v| + |w|)^2.$$

Taking the square root of each side yields what we want.


We call $|\ |$ the $L^2$-norm (or we should really say the $L^2$-seminorm).

An element $v$ of $E$ is said to be a unit vector if $|v| = 1$. If
$|v| \ne 0$, then $v/|v|$ is a unit vector.

Let $w \in E$ be an element such that $|w| \ne 0$, and let $v \in E$. There
exists a unique number $c$ such that $v - cw$ is perpendicular to $w$. Indeed,
for $v - cw$ to be perpendicular to $w$ we must have

$$\langle v - cw, w\rangle = 0,$$

whence $\langle v, w\rangle - \langle cw, w\rangle = 0$ and
$\langle v, w\rangle = c\langle w, w\rangle$. Thus

$$c = \frac{\langle v, w\rangle}{\langle w, w\rangle}.$$

Conversely, letting $c$ have this value shows that $v - cw$ is perpendicular
to $w$. We call $c$ the Fourier coefficient of $v$ with respect to $w$.

Let $v_1, \ldots, v_n$ be elements of $E$ which are not in $E_0$, and which are
mutually perpendicular, that is $\langle v_i, v_j\rangle = 0$ if $i \ne j$. Let
$c_i$ be the Fourier coefficient of $v$ with respect to $v_i$. Then

$$v - c_1 v_1 - c_2 v_2 - \cdots - c_n v_n$$

is perpendicular to $v_1, \ldots, v_n$. Indeed, all we have to do is to take the
product of this element with $v_j$. All the terms involving
$\langle v_i, v_j\rangle$ with $i \ne j$ will give 0, and we shall have two terms

$$\langle v, v_j\rangle - c_j\langle v_j, v_j\rangle$$

which cancel. Thus subtracting linear combinations as above orthogonalizes
$v$ with respect to $v_1, \ldots, v_n$.
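
The orthogonalization step can be sketched in Python as follows. This is an illustration under the assumption that $E = \mathbf{C}^n$ with the standard hermitian product; `herm`, `fourier_coefficient`, and `orthogonalize` are hypothetical helper names, and the sample vectors are ours.

```python
def herm(v, w):
    """Standard hermitian product on C^n (linear in the first variable)."""
    return sum(a * b.conjugate() for a, b in zip(v, w))

def fourier_coefficient(v, w):
    """c = <v, w> / <w, w>, defined when <w, w> != 0."""
    return herm(v, w) / herm(w, w)

def orthogonalize(v, family):
    """Return v - c_1 v_1 - ... - c_n v_n for a mutually perpendicular family."""
    result = list(v)
    for vi in family:
        c = fourier_coefficient(v, vi)
        result = [r - c * x for r, x in zip(result, vi)]
    return result

v1 = [1, 1, 0]
v2 = [1, -1, 0]      # perpendicular to v1
v = [2, 3, 5]
u = orthogonalize(v, [v1, v2])   # u should be perpendicular to v1 and v2
```

Note that the coefficients are all computed from the original $v$, which is legitimate only because the family is already mutually perpendicular.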
We have two useful identities, namely: 


The Pythagoras Theorem. If $u, w \in E$ are perpendicular, then

$$|u + w|^2 = |u|^2 + |w|^2.$$

The Parallelogram Law. For $u, w \in E$, we have

$$|u + w|^2 + |u - w|^2 = 2|u|^2 + 2|w|^2.$$


The proofs come immediately from expanding out the norm according to 
the definitions. 

Let $\{v_i\}_{i \in I}$ be a family of elements of $E$ such that $|v_i| \ne 0$
for all $i$. For each finite subfamily, we can take the space generated by this
subfamily, i.e. linear combinations

$$c_1 v_{i_1} + \cdots + c_n v_{i_n}$$

with complex coefficients $c_i$. The union of all such spaces is called the
space generated by the family $\{v_i\}_{i \in I}$. Let us denote this space by
$F$. We say that the family $\{v_i\}$ is total in $E$ if the closure of $F$ is
equal to all of $E$.

As a matter of notation, we shall omit the double indices and write
$v_1, \ldots, v_n$ instead of $v_{i_1}, \ldots, v_{i_n}$.

We say that the family $\{v_i\}$ is an orthogonal family if its elements are
mutually perpendicular, that is $\langle v_i, v_j\rangle = 0$ if $i \ne j$, and
if in addition $|v_i| \ne 0$ for all $i$. We say that it is an orthonormal
family if it is orthogonal and if $|v_i| = 1$ for all $i$. One can always obtain
an orthonormal family from an orthogonal family by dividing each vector by its
norm. A total orthonormal family is called a Hilbert basis, or also an
orthonormal basis. (Warning: It is not necessarily a “basis” in the sense of
abstract algebra, i.e. not every element of the space is a linear combination
of a finite number of elements in a Hilbert basis.)


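Theorem 1.4. Let $\{v_i\}$ be an orthogonal family in $E$. Let $x \in E$ and let
$c_i$ be the Fourier coefficient of $x$ with respect to $v_i$. Let $\{a_i\}$ be a
family of numbers. Then

$$\Big| x - \sum_{k=1}^n c_k v_k \Big| \le \Big| x - \sum_{k=1}^n a_k v_k \Big|.$$

Proof. We know that

$$x - \sum_{k=1}^n c_k v_k$$

is orthogonal to each $v_i$, $i = 1, \ldots, n$. Hence we get from Pythagoras:

$$\Big| x - \sum_{k=1}^n a_k v_k \Big|^2
  = \Big| x - \sum_{k=1}^n c_k v_k + \sum_{k=1}^n (c_k - a_k) v_k \Big|^2
  = \Big| x - \sum_{k=1}^n c_k v_k \Big|^2 + \Big| \sum_{k=1}^n (c_k - a_k) v_k \Big|^2.$$

This proves the desired inequality.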
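
To see the theorem at work numerically, the sketch below compares the distance obtained with the Fourier coefficients against the distance obtained with randomly perturbed coefficients. The setting ($\mathbf{C}^4$ with the standard product), the sample family, and the helper names are our assumptions, chosen only for illustration.

```python
import random

def herm(v, w):
    """Standard hermitian product on C^n."""
    return sum(a * b.conjugate() for a, b in zip(v, w))

def norm(v):
    return herm(v, v).real ** 0.5

def combo(coeffs, family):
    """Linear combination sum_k coeffs[k] * family[k]."""
    n = len(family[0])
    return [sum(c * f[i] for c, f in zip(coeffs, family)) for i in range(n)]

def dist(x, coeffs, family):
    """|x - sum_k coeffs[k] v_k|."""
    y = combo(coeffs, family)
    return norm([a - b for a, b in zip(x, y)])

family = [[1, 1, 0, 0], [1, -1, 0, 0], [0, 0, 2, 0]]   # orthogonal, not normalized
x = [1 + 2j, 3, -1j, 4]
c = [herm(x, v) / herm(v, v) for v in family]           # Fourier coefficients
best = dist(x, c, family)

rng = random.Random(1)
never_beaten = all(
    best <= dist(
        x,
        [ck + complex(rng.uniform(-1, 1), rng.uniform(-1, 1)) for ck in c],
        family,
    ) + 1e-12
    for _ in range(500)
)
```

Here the residual $x - \sum c_k v_k$ is $(0, 0, 0, 4)$, so the minimal distance is 4.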


A pre-Hilbert space is a vector space with a positive definite hermitian
form. If we start with a space with a form which is only positive (not
definite), we can obtain a pre-Hilbert space by taking the factor space
$E/E_0$ (i.e. equivalence classes of elements of $E$ modulo $E_0$). Similarly,
we can form the completion of $E$. Viewing $E$ as a space over the reals, we
can extend the $\mathbf{R}$-bilinear form $\langle\ ,\ \rangle$ to the
completion. If $E$ is a pre-Hilbert space, then the extended form is hermitian
positive definite. (That it is hermitian positive follows by continuity. For
the definiteness, if $\{x_n\}$ is a sequence converging to $x$, and $x \ne 0$,
we may assume that $x_n \ne 0$, and then that $\{x_n/|x_n|\}$ converges to
$x/|x|$. Thus we may deal with unit vectors, whence the definiteness follows
immediately.)

A Hilbert space is a vector space with a positive definite hermitian
form, which is complete under the corresponding $L^2$-norm. Thus we see
that the completion of a pre-Hilbert space is a Hilbert space.


Lemma 1.5. Let $E$ be a Hilbert space, and $F$ a closed subspace. Let
$x \in E$ and let

$$a = \inf_{y \in F} |x - y|.$$

Then there exists an element $y_0 \in F$ such that

$$a = |x - y_0|.$$

Proof. Let $\{y_n\}$ be a sequence in $F$ such that $|y_n - x|$ approaches $a$.
We show that $\{y_n\}$ is Cauchy. By the parallelogram law, we have

$$|y_n - y_m|^2 = 2|y_n - x|^2 + 2|y_m - x|^2 - 4\big|\tfrac{1}{2}(y_n + y_m) - x\big|^2
  \le 2|y_n - x|^2 + 2|y_m - x|^2 - 4a^2$$

because of the definition of $a$ (the midpoint $\tfrac{1}{2}(y_n + y_m)$ lies in
$F$). The right-hand side tends to 0 as $m, n \to \infty$. This shows that
$\{y_n\}$ is Cauchy, and thus converges to some vector $y_0$, which lies in $F$
since $F$ is closed. The lemma follows by continuity.


Theorem 1.6. Let $F$ be a closed subspace of the Hilbert space $E$, and
assume that $F \ne E$. Then there exists an element $z \in E$, $z \ne 0$, such
that $z$ is perpendicular to $F$.

Proof. Let $x \in E$ and $x \notin F$. Let $y_0 \in F$ be at minimal distance
from $x$ (by the lemma), and let $a$ be this distance. Let $z = x - y_0$. Then
$z \ne 0$ since $F$ is closed. For all $y \in F$, $y \ne 0$ and complex
$\alpha$, we have

$$|x - y_0|^2 \le |x - y_0 + \alpha y|^2$$

whence, expanding out, we obtain

$$0 \le \alpha\langle y, z\rangle + \bar\alpha\langle z, y\rangle + \alpha\bar\alpha\langle y, y\rangle.$$

[Figure 5.1]

We let $\alpha = t\langle z, y\rangle$, with $t$ real $\ne 0$. We can then cancel
$t$ and get a contradiction for small $t$, if $\langle y, z\rangle \ne 0$. This
proves the theorem.


Corollary 1.7. Let $E$ be a Hilbert space, $E \ne \{0\}$. Then there exists a
total orthogonal basis for $E$.

Proof. Let $S$ be the set of non-empty orthogonal families. If
$\mathcal{F}_1$, $\mathcal{F}_2$ are orthogonal families, we define
$\mathcal{F}_1 \le \mathcal{F}_2$ if $\mathcal{F}_1 \subset \mathcal{F}_2$. This
gives an inductive ordering. Let $\mathcal{B}$ be a maximal element (which
exists by Zorn's lemma), and let $F$ be the subspace generated by
$\mathcal{B}$. We contend that $F$ is dense in $E$. Otherwise,
$\bar F \ne E$, and by the theorem (applied to the closed subspace $\bar F$)
there exists $z \in E$, $z \ne 0$ and $z$ perpendicular to $\bar F$, hence to
$F$. We can then obtain a bigger orthogonal family than $\mathcal{B}$, a
contradiction which proves our corollary.


Corollary 1.8. Let $E$ be a Hilbert space, and $F$ a closed subspace. Then

$$E = F + F^\perp.$$

Proof. If $y_n \in F$ and $z_n \in F^\perp$, then the sequence $\{y_n + z_n\}$
is Cauchy if and only if $\{y_n\}$ is Cauchy and $\{z_n\}$ is Cauchy (by the
Pythagoras theorem). Hence $F + F^\perp$ is closed. If $F + F^\perp \ne E$, then
there exists $w \in E$, $w \ne 0$, which is perpendicular to $F + F^\perp$,
whence perpendicular to $F$, so that $w \in F^\perp$. But $w$ is also
perpendicular to $F^\perp$, hence to itself, a contradiction which proves the
corollary.


We observe that if $F$ is a closed subspace, then $F^{\perp\perp} = F$. For any
$x \in E$, we can write uniquely

$$x = y + z$$

with $y \in F$ and $z \in F^\perp$. The map $P: E \to E$ such that

$$Px = y$$

is called the orthogonal projection on $F$. It is obviously a continuous
linear map, and we study such maps in greater detail in Chapter XVIII, §5.
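
For a finite dimensional picture of the decomposition $x = y + z$, here is a small Python sketch. It assumes that $F$ is the span of a given mutually perpendicular family in $\mathbf{C}^3$ with the standard product; the function name `project` and the sample data are ours.

```python
def herm(v, w):
    """Standard hermitian product on C^n."""
    return sum(a * b.conjugate() for a, b in zip(v, w))

def project(x, orthogonal_basis):
    """Orthogonal projection of x on the span of a mutually perpendicular family."""
    y = [0] * len(x)
    for v in orthogonal_basis:
        c = herm(x, v) / herm(v, v)      # Fourier coefficient of x w.r.t. v
        y = [yi + c * vi for yi, vi in zip(y, v)]
    return y

basis_F = [[1, 0, 1], [0, 1, 0]]     # spans the subspace F
x = [3, 4, 1]
y = project(x, basis_F)              # component of x in F
z = [a - b for a, b in zip(x, y)]    # component of x in the orthogonal complement
```

Applying `project` a second time to `y` returns `y` again, which reflects the relation $P^2 = P$ for an orthogonal projection.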


Corollary 1.9. Let $E$ be a Hilbert space. Let $\{F_i\}$ ($i = 1, 2, \ldots$) be
a sequence of closed subspaces which are mutually perpendicular, that is
$F_i \perp F_j$ if $i \ne j$. Let $F$ be the closure of the space generated by
all $F_i$. (In other words, $F$ is the closure of the space consisting of all
sums

$$x_1 + \cdots + x_n, \qquad x_i \in F_i.)$$

Then every element $x$ of $F$ has a unique expression as a convergent
series

$$x = \sum_{i=1}^\infty x_i, \qquad x_i \in F_i.$$

Let $P_i$ be the orthogonal projection on $F_i$. Then $x_i = P_i x$, and for any
choice of elements $y_i \in F_i$ we have

$$\Big| x - \sum_{i=1}^n P_i x \Big| \le \Big| x - \sum_{i=1}^n y_i \Big|.$$

Proof. Since

$$x - \sum_{i=1}^n P_i x$$

is orthogonal to $F_1, \ldots, F_n$ we can use exactly the same argument as in
Theorem 1.4, and the Pythagoras theorem to show the last inequality,
writing

$$\Big| x - \sum_{i=1}^n y_i \Big|^2
  = \Big| x - \sum_{i=1}^n P_i x \Big|^2 + \Big| \sum_{i=1}^n (P_i x - y_i) \Big|^2.$$

There exists a sequence of finite sums as above which approaches $x$. It
therefore follows that the partial sums

$$\sum_{i=1}^n P_i x$$

must approach $x$ also. If

$$x = \sum_{i=1}^\infty x_i$$

with $x_i \in F_i$, then we apply the projection $P_i$ (which is continuous!) to
conclude that $P_i x = x_i$, thus proving the uniqueness.


It is convenient to call the family $\{F_i\}$ an orthogonal decomposition of
$F$ in the preceding theorem. If $F = E$, then we call it an orthogonal
decomposition of $E$, of course.

Suppose that the Hilbert space $E$ has a denumerable total family $\{v_n\}$,
which we assume to be orthonormal. Then every element $x \in E$ can be written
as a convergent series

$$x = \sum_{n=1}^\infty a_n v_n,$$

where $a_n$ is the Fourier coefficient of $x$ with respect to $v_n$, and the
convergence is of course with respect to the $L^2$-norm. Namely, we take
the spaces $F_n$ in the previous discussion to be the 1-dimensional spaces
generated by $v_n$. In particular, we see that $\sum |a_n|^2$ converges, and that

$$|x|^2 = \sum_{n=1}^\infty |a_n|^2.$$

If $\{v_n\}$ is merely an orthonormal system, not necessarily a Hilbert basis,
then of course we don't get the equality, merely the inequality

$$\sum_{n=1}^\infty |a_n|^2 \le |x|^2.$$

This is called the Bessel inequality, and it is essentially obvious from
previous discussions. For instance, for each $n$ we can write

$$x = x - \sum_{k=1}^n a_k v_k + \sum_{k=1}^n a_k v_k$$

and apply Pythagoras' theorem.
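
A small numerical instance of the Bessel inequality may help fix ideas. The sketch below assumes the ambient space $\mathbf{C}^3$ with the standard product and an orthonormal system of two vectors, which is therefore not total; all names and data are ours.

```python
def herm(v, w):
    """Standard hermitian product on C^n."""
    return sum(a * b.conjugate() for a, b in zip(v, w))

s = 2 ** -0.5
system = [[s, s, 0], [s, -s, 0]]     # orthonormal, but not total in C^3
x = [1, 2, 2]

a = [herm(x, v) for v in system]     # Fourier coefficients (each |v| = 1)
bessel_sum = sum(abs(ak) ** 2 for ak in a)
norm_sq = herm(x, x).real
```

Here `bessel_sum` is approximately 5 while `norm_sq` is 9; the deficit is the squared length of the component of $x$ perpendicular to the system.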

Conversely, we can define directly a set $l^2$ consisting of all sequences
$\{a_n\}$ such that $\sum |a_n|^2$ converges. If $\alpha = \{a_n\}$ and
$\beta = \{b_n\}$ are two sequences in this space, then using the Schwarz
inequality, on finite partial sums, one sees that

$$\sum |a_n b_n|$$

converges, whence we can define a product

$$\langle \alpha, \beta\rangle = \sum a_n \bar b_n.$$

Again from the above convergence, we conclude that $l^2$ is in fact a vector
space, because

$$\sum |a_n + b_n|^2 \le \sum |a_n|^2 + \sum 2|a_n b_n| + \sum |b_n|^2.$$

Furthermore, this product is a hermitian product on it. Finally, it is but
an exercise to verify that $l^2$ is complete. Indeed, the family $\{v_n\}$ is
total, orthonormal in the completion of $l^2$, and in this completion any
element can be expressed as a convergent series, described above. Thus the
elements of the completion are precisely those of $l^2$.

The space $l^2$ can also be interpreted as the completion of a space of
functions, those periodic of period $2\pi$, say, a total orthogonal family then
being constituted by the functions

$$\chi_n(t) = e^{int}$$

where $n$ ranges over all integers (positive, negative, or zero).
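
The orthogonality of the functions $e^{int}$ can be checked numerically. The text does not fix the hermitian product on periodic functions at this point, so the sketch below assumes the normalized product $\frac{1}{2\pi}\int_0^{2\pi} f \bar g$, approximated by a Riemann sum; `chi` and `inner` are hypothetical names.

```python
import cmath
import math

def chi(n, t):
    """The function t -> e^{int}."""
    return cmath.exp(1j * n * t)

def inner(m, n, samples=1024):
    """Riemann-sum approximation of (1/2pi) * integral_0^{2pi} chi_m * conj(chi_n)."""
    h = 2 * math.pi / samples
    total = sum(chi(m, k * h) * chi(n, k * h).conjugate() for k in range(samples))
    return total * h / (2 * math.pi)

same = inner(2, 2)        # close to 1
different = inner(2, -3)  # close to 0
```

With equally spaced sample points the sum for $m \ne n$ is a sum over roots of unity, so it vanishes up to rounding as long as $|m - n|$ is smaller than the number of samples.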

It is clear that any two Hilbert spaces having denumerable orthonormal
total families are isomorphic under the map which sends one
family on the other. Indeed, if $G$ is another Hilbert space with total
orthonormal family $\{e_n\}$, then the map

$$\sum a_n v_n \mapsto \sum a_n e_n$$

is linear and preserves the norm. In this way, we get a map from our
space of periodic functions into $l^2$, which is injective and preserves the
norm. It extends therefore uniquely to the completion.

In general, if two Hilbert spaces have total orthonormal families with
the same cardinality, then any bijection between these families extends to
a unique norm-preserving linear map of one space to the other.


V, §2. FUNCTIONALS AND OPERATORS 


Theorem 2.1. For every $y$ in the Hilbert space $E$, the map $\lambda_y$ such
that $\lambda_y(x) = \langle x, y\rangle$ is a functional. The association

$$y \mapsto \lambda_y$$

is a norm-preserving antilinear isomorphism between $E$ and its dual space
$E'$.

Proof. The Schwarz inequality shows that $|\lambda_y| \le |y|$, and evaluating
$\lambda_y$ at $y$ shows that $|\lambda_y| = |y|$, so we get a norm-preserving
semi-linear map of $E$ into $E'$, semi-linear because of the hermitian nature
of the scalar product, namely for complex $\alpha$,

$$\lambda_{\alpha y} = \bar\alpha \lambda_y.$$

There remains to show that every functional comes from some $y \in E$. Let
$\lambda$ be a functional, and let $F$ be its kernel (the closed subspace of all
$x$ such that $\lambda(x) = 0$). If $F = E$, then $\lambda = 0$ and we take
$y = 0$. If $F \ne E$, there exists $z \in E$, $z \ne 0$ such that $z$ is
perpendicular to $F$ (by Theorem 1.6). We contend that some scalar
multiple of $z$ achieves our purpose, say $\alpha z$. A necessary condition on
$\alpha$ is that

$$\langle z, \alpha z\rangle = \lambda(z)$$

or in other words, $\bar\alpha = \lambda(z)/\langle z, z\rangle$. This is also
sufficient. Indeed, for any $x \in E$, we can write

$$x = x - \frac{\lambda(x)}{\lambda(z)} z + \frac{\lambda(x)}{\lambda(z)} z$$

and

$$x - \frac{\lambda(x)}{\lambda(z)} z$$

lies in $F$. Taking the product with $\alpha z$, we obtain

$$\langle x, \alpha z\rangle = \lambda(x)$$

thus proving our theorem.


By an operator we shall mean a continuous linear map of E into itself. 
As we know, the space of operators End(E) is a Banach space.
By Herm(E) we denote the set of all continuous hermitian forms on E.
By Sesqu(E) we denote the set of all continuous sesquilinear forms on E. 
It is immediately verified that both these sets are in fact Banach spaces, 
and that Herm(E) is a closed subspace of Sesqu(E). We shall now relate 
continuous sesquilinear forms on E and operators. 

Let $A: E \to E$ be an operator. We define $\varphi_A$ by

$$\varphi_A(x, y) = \langle Ax, y\rangle.$$

Then $\varphi_A$ is obviously a continuous sesquilinear form on $E$. Conversely,
let $\varphi$ be such a form. For each $y \in E$ the map

$$x \mapsto \varphi(x, y)$$

is a functional, and consequently there exists a unique $y^* \in E$ such that
for all $x \in E$ we have

$$\varphi(x, y) = \langle x, y^*\rangle.$$

The map $y \mapsto y^*$ is immediately verified to be linear, using the
uniqueness of the element $y^*$ representing $\varphi$. Furthermore, from the
Schwarz inequality, we find that

$$|y^*| \le |\varphi|\,|y|.$$

If we define $A^*: E \to E$ to be the map such that $A^* y = y^*$, then we
conclude that $A^*$ is a continuous linear map of $E$ into itself, i.e. an
operator.

On the other hand, if we define $\psi(y, x) = \overline{\varphi(x, y)}$, then
$\psi$ is sesquilinear continuous, and by what we have just seen, there exists
a unique operator $A$ such that $\psi(y, x) = \langle y, Ax\rangle$, or in other
words

$$\varphi(x, y) = \langle Ax, y\rangle.$$

Thus $\varphi = \varphi_A$ for some $A$.

Theorem 2.2. The association

$$A \mapsto \varphi_A$$

is a norm-preserving isomorphism between End(E) and the space of
continuous sesquilinear forms on $E$.

Proof. All that remains to be proved is that $|A| = |\varphi_A|$. But

$$|\varphi_A(x, y)| \le |A|\,|x|\,|y|$$

so that $|\varphi_A| \le |A|$. Conversely, we know that $|Ax| = |\lambda_{Ax}|$
and

$$|\lambda_{Ax}(y)| \le |\varphi_A|\,|x|\,|y|.$$

Hence $|Ax| \le |\varphi_A|\,|x|$. This proves that $|A| \le |\varphi_A|$, whence
our theorem follows.


We have also shown that to each operator $A$ we can associate a
unique operator $A^*$ satisfying the relations

$$\langle Ax, y\rangle = \langle x, A^* y\rangle$$

for all $x, y \in E$. We call $A^*$ the adjoint of $A$ (transpose of $A$ if our
Hilbert space is over the reals).
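
In the finite dimensional case $E = \mathbf{C}^n$ with the standard product, the adjoint of a matrix is its conjugate transpose. This is a standard fact which the text does not spell out here, so the Python sketch below should be read as an illustration under that assumption; `apply` and `adjoint` are our names.

```python
def herm(v, w):
    """Standard hermitian product on C^n (linear in the first variable)."""
    return sum(a * b.conjugate() for a, b in zip(v, w))

def apply(A, x):
    """Matrix-vector product, with A given as a list of rows."""
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]

def adjoint(A):
    """Conjugate transpose of a square matrix given as a list of rows."""
    n = len(A)
    return [[A[j][i].conjugate() for j in range(n)] for i in range(n)]

A = [[1 + 1j, 2], [0, 3 - 2j]]
x = [1j, 2]
y = [1 - 1j, 1j]

lhs = herm(apply(A, x), y)            # <Ax, y>
rhs = herm(x, apply(adjoint(A), y))   # <x, A*y>
```

Both sides evaluate to $-2 - 2i$ for this choice of $A$, $x$, $y$.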


Theorem 2.3. The map $A \mapsto A^*$ satisfies the following properties:

$$(A + B)^* = A^* + B^*, \qquad A^{**} = A,$$
$$(\alpha A)^* = \bar\alpha A^*, \qquad (AB)^* = B^* A^*,$$

and for the norm,

$$|A^*| = |A|, \qquad |A^* A| = |A|^2.$$

Proof. The first four properties are immediate from the definitions.
For instance,

$$\langle \alpha Ax, y\rangle = \langle Ax, \bar\alpha y\rangle
  = \langle x, A^* \bar\alpha y\rangle = \langle x, \bar\alpha A^* y\rangle.$$

From the uniqueness we conclude that $(\alpha A)^* = \bar\alpha A^*$. The others
are equally easy, and are left to the reader. As for the norm properties, we
have

$$|\langle A^* x, y\rangle| = |\langle x, Ay\rangle| \le |A|\,|x|\,|y|$$

so that

$$|\varphi_{A^*}| = |A^*| \le |A|.$$

Since $A^{**} = A$, it follows that $|A| \le |A^*|$ so $|A| = |A^*|$. Finally,

$$|A^* A| \le |A^*|\,|A| = |A|^2,$$

and conversely,

$$|Ax|^2 = \langle Ax, Ax\rangle = \langle A^* Ax, x\rangle \le |A^* A|\,|x|^2$$

so that $|A| \le |A^* A|^{1/2}$. This proves our theorem.

If $\varphi$ is a continuous sesquilinear form on $E$, we define the function

$$q(x) = \varphi(x, x)$$

to be its associated quadratic form. In the complex case, we can recover
the sesquilinear form from the quadratic form. We phrase this in terms
of operators.


Theorem 2.4. For a complex Hilbert space, if $A$ is an operator and
$\langle Ax, x\rangle = 0$ for all $x$, then $A = O$.

Proof. This follows from what is called the polarization identity,

$$\langle A(x + y), x + y\rangle - \langle A(x - y), x - y\rangle
  = 2[\langle Ax, y\rangle + \langle Ay, x\rangle].$$

Under the assumption of Theorem 2.4, the left-hand side is equal to 0.
Replacing $x$ by $ix$, we get

$$\langle Ax, y\rangle + \langle Ay, x\rangle = 0,$$
$$i\langle Ax, y\rangle - i\langle Ay, x\rangle = 0.$$

From this it follows that $\langle Ax, y\rangle = 0$ and hence that $A = O$.


Theorem 2.4 is of course false in the real case, since a rotation is not 
necessarily O, but may map every vector on a vector perpendicular to it. 
However, in Chapter XVIII we shall deal with the case when $A = A^*$, in
which case the result remains true, obviously.
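
The rotation counterexample can be made concrete. The sketch below takes the rotation of the plane by 90 degrees (our choice of example) and evaluates $\langle Ax, x\rangle$ on a few real vectors and on one complex vector; the helper names are ours.

```python
def herm(v, w):
    """Standard hermitian product; reduces to the dot product on real vectors."""
    return sum(a * b.conjugate() for a, b in zip(v, w))

def apply(A, x):
    """Matrix-vector product, with A given as a list of rows."""
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]

A = [[0, -1], [1, 0]]    # rotation of the plane by 90 degrees, A != O

real_vectors = [[1, 0], [0, 1], [2, -3], [5, 7]]
real_values = [herm(apply(A, x), x) for x in real_vectors]   # all equal to 0

x = [1, 1j]                                # a genuinely complex vector
complex_value = herm(apply(A, x), x)       # equals -2i, not 0
```

On real vectors the quadratic form vanishes identically although $A \ne O$, while over the complex numbers the same matrix has a non-zero value at $x = (1, i)$, in agreement with Theorem 2.4.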

Operators A such that A = A* are called hermitian, or self adjoint. 
We shall study these especially in Chapter XVIII. 


V, §3. EXERCISES 


For the first two exercises, recall that a sequence $\{x_n\}$ in a Hilbert
space $H$ converges weakly to 0 if for all $y \in H$ we have
$\lim \langle x_n, y\rangle = 0$.

1. Let $\{v_n\}$ ($n = 1, 2, \ldots$) be a denumerable Hilbert basis for the
Hilbert space $H$. Show that the sequence $\{v_n\}$ converges weakly to 0, and
hence that the unit sphere is not closed in the unit ball for the weak
topology.

2. Suppose the Hilbert space $H$ has a countable basis. Let $x \in H$ be such
that $|x| \le 1$. Show that there exists a sequence $\{u_n\}$ in $H$ with
$|u_n| = 1$ for all $n$ such that $\{u_n\}$ converges weakly to $x$.

3. Let $X$ be a closed convex subset of a Hilbert space. Show that there exists
a point in $X$ which is at smallest distance from the origin.

4. Let $E$ be a Hilbert space, and let $\{x_n\}$ be an orthonormal basis. Let
$\{c_n\}$ be a sequence of positive numbers such that $\sum c_n^2$ converges.
Let $C$ be the subset of $E$ consisting of all sums $\sum a_n x_n$ where
$|a_n| \le c_n$. Show that $C$ is compact.

5. Show that a Hilbert space is separable (has a countable base for the
topology) if and only if it has a countable orthonormal basis.

6. Let $A$ be an operator on a Hilbert space. Show that

$$\mathrm{Ker}\, A = (\mathrm{Im}\, A^*)^\perp.$$

7. Let $E$ be the vector space of real valued continuous functions on an
interval $[a, b]$. Let $K = K(x, y)$ be a continuous function of two variables,
defined on the square $a \le x \le b$ and $a \le y \le b$. An element $f$ of
$E$ is said to be an eigenfunction for $K$, with respect to a real number $r$,
if

$$f(y) = r \int_a^b K(x, y) f(x)\, dx.$$

We take $E$ with the $L^2$-norm of the hermitian product given by

$$\langle f, g\rangle = \int_a^b fg.$$

Prove that if $f_1, \ldots, f_n$ are in $E$, mutually orthogonal, and of
$L^2$-norm equal to 1, and if they are eigenfunctions with respect to the same
number $r$, then $n$ is bounded by a number depending only on $K$ and $r$.
[Hint: Apply Bessel's inequality.]

8. Let $E$ be a pre-Hilbert space.
(a) If $E$ is complex, then $\mathrm{Im}\langle x, y\rangle = \mathrm{Re}\langle x, iy\rangle$.
(b) Let $x, y \in E$. If $E$ is real, then

$$\langle x, y\rangle = \tfrac{1}{2}(|x + y|^2 - |x|^2 - |y|^2).$$

If $E$ is complex, then

$$\langle x, y\rangle = \tfrac{1}{2}(|x + y|^2 - |x|^2 - |y|^2)
  + \tfrac{i}{2}(|x + iy|^2 - |x|^2 - |y|^2).$$

(c) Let $F$ be a normed vector space such that the parallelogram law holds for
its norm. Define $\langle x, y\rangle$ by the formula in (b). Show that this is
a positive definite scalar product.


PART THREE 


Integration 


This part deals with integration in multiple contexts. We start with the 
integral on arbitrary measured spaces, setting the basic framework in a 
context which makes its structure particularly clear. The main idea is 
that one starts the theory of the integral by defining the integral on a 
natural space of simple functions where one sees immediately what the 
integral means. The space of step functions is the one which covers all 
cases, from the most general to the most special. As we shall also see, if 
one wants integration on the reals, or in euclidean space, then the space 
generated by characteristic functions of intervals or cubes, or the $C^\infty$
functions with compact support, also form a natural starting space for 
integration. 

It turns out that for the basic framework of integration, all one needs 
for the space of values is linearity and completeness, so a Banach space. 
I think it obscures matters to assume (as is often done) that values are 
first taken in the real numbers, and to make abusive use of the ordering 
properties of the reals and of positivity in setting up the integral. Fur- 
ther comments on this will be made in Chapter VI, especially the intro- 
ductory comments. 

However, doing general Banach valued integration on measured spaces 
does not mean that one eventually slights special properties of complex 
valued integration over the real numbers. This entire part will mix gen- 
eral considerations with particular situations and examples, especially on 
euclidean space and the real line. Readers can see how having the gen- 
eral machinery of integration on measured spaces, or locally compact 
spaces, is used to make easier the formulation of more concrete results. 
For instance, in Chapter VIII, we give specific results on approximations 
on $\mathbf{R}$ or $\mathbf{R}^n$ with Dirac sequences and families. In Chapter IX, two
sections on functions of bounded variations and the Stieltjes integral
illustrate the general relationships between measures and functionals on
$C^\infty$ functions with compact support. They also emphasize what is peculiar
to the real numbers, as distinguished from what holds when the
values are taken in an arbitrary Banach space. 

Thus, throughout this part, we see general integration theory on mea- 
sured spaces alternate with special features on euclidean spaces or on the 
real line. 


CHAPTER VI 


The General Integral 


In this chapter we develop integration theory. We want two things from 
an integral which are not provided by the standard Riemann integral of 
bounded functions: 


(1) We want to integrate unbounded functions. 
(2) We want to be able to take limits under the integral sign, of a 
fairly general nature, more general than uniform limits. 


To achieve this, we proceed in a manner entirely similar to the manner
used when extending the integral to the completion of a space of step
functions, except that instead of the sup norm we use the $L^1$-norm.
Simple and basic lemmas then allow us to identify elements of the completion
with actual functions, and all properties of the integral then
become just as easy to prove as in the earlier versions of integration.
The lemmas are designed to show that if in addition to $L^1$-convergence
we require pointwise convergence almost everywhere, then we still recover
essentially the $L^1$-completion, up to functions which vanish almost
everywhere.

The treatment here is a conglomerate of various treatments in the 
literature. Unlike most treatments, however, I have based the existence 
and definition of the integral on a very simple lemma, which I call the 
fundamental lemma of integration (Lemma 3.1). It can be proved ab ovo 
with a very short proof, and shows immediately how an $L^1$-Cauchy
sequence of functions converges (almost uniformly!). From this convergence,
one can immediately see how to extend the integral “by continuity”
from step maps to the most general class of mappings which is
desired. In the basic lemma, positivity plays no role whatsoever. A
posteriori, one notices that the monotone convergence theorem and the
“Fatou lemma” of other treatments become immediate corollaries of the
basic approximation lemmas derived from Lemma 3.1. Thus it turns out 
that it is easier to work immediately with complex valued functions than 
to go through the sequence of many other treatments, via positive func- 
tions, real functions, and only then complex functions decomposed into 
real and imaginary parts. The proofs become shorter, more direct, and 
to me much more natural. One also observes that with this approach 
nothing but linearity and completeness in the space of values is used. 
Thus one obtains at once integration with Banach valued functions. But 
readers may well omit considering this case if it makes them more com- 
fortable to deal with C-valued functions only. Note, however, that vector 
space valued functions are useful in giving an especially simple proof for 
the Fubini theorem, which again I find more transparent than the proof 
used in many treatments, based on positivity. Historically, Bochner was 
the first to consider integration of Banach valued functions. From the 
point of view taken here, there is no difference between Banach or com- 
plex valued functions. 

Actually, it is a reasonable question why one should want to identify
elements of the completion with functions: why not just work formally
with Cauchy sequences? One of the basic reasons is that certain properties
of the formal completion which one wishes to use are obvious if
elements of this completion are identifiable with functions. For example,
consider the space $L$ of continuous functions on $[0, 1]$. Let $T: L \to L$
be the linear map given by $Tf(x) = xf(x)$. Then $T$ is continuous for the
$L^1$-norm on this space, whence $T$ extends uniquely to a continuous linear
map $\bar T$ on the completion $\bar L$. Now it is clear that $T$ is injective
on $L$, and one can ask if $\bar T: \bar L \to \bar L$ is also injective. If we
can identify an element of the completion with a function $f$ so that $\bar T$
is again given as multiplication by $x$, then one sees at once that $\bar T$ is
injective. Otherwise, one has to prove some lemma about $L^1$-Cauchy sequences
which amounts to a special case of those proved to establish the
representation of elements of the completion by functions, and which serve in
a wide variety of contexts.

I would also like to draw the reader’s attention to the approximation 
Theorem 6.3, which gives a key result in line with our general approach: 
to prove something in integration theory, first prove it for a subspace of 
functions for which the result is obvious, then extend by linearity and 
continuity to the largest possible space. 


VI, §1. MEASURED SPACES, MEASURABLE MAPS,
AND POSITIVE MEASURES 


Let $X$ be a set (non-empty). By a σ-algebra in $X$ we mean a collection
of subsets $\mathcal{M}$ having the following properties:

σ-ALG 1. The empty set is in $\mathcal{M}$.
σ-ALG 2. The collection is closed under taking complements (in $X$)
and denumerable unions. In other words, if $A \in \mathcal{M}$ then
$\mathcal{C}A \in \mathcal{M}$ (where $\mathcal{C}A$ is the complement of $A$
in $X$), and if $\{A_n\}$ is a sequence of elements of $\mathcal{M}$, then

$$\bigcup_{n=1}^\infty A_n$$

is also an element of $\mathcal{M}$.

We conclude at once from these conditions that the whole set $X$ is in
$\mathcal{M}$, and that a denumerable intersection of elements of $\mathcal{M}$
is also in $\mathcal{M}$. Also, using empty sets, we see that finite unions or
intersections of elements of $\mathcal{M}$ are also in $\mathcal{M}$, and we
could just as well have assumed this by saying “countable” instead of
“denumerable” in our second axiom.

A set $X$ together with a σ-algebra $\mathcal{M}$ is called a measurable space,
and the elements of $\mathcal{M}$ are called its measurable sets. We note that
if $A$, $B$ are measurable, and if we denote by $A - B$ the set

$$A - B = A \cap \mathcal{C}_X B$$

consisting of all elements of $A$ not in $B$, then $A - B$ is measurable.
To prove that a collection of subsets is a o-algebra, we shall often use 
the following characterization: 


A collection $\mathcal{M}$ of subsets of $X$ is a σ-algebra if and only if it
contains the empty set, is closed under taking complements, finite
intersection, and such that, if $\{A_n\}$ is a sequence of disjoint elements of
$\mathcal{M}$ then the union $\bigcup A_n$ is in $\mathcal{M}$.

Proof. This is clear since we can write

$$\bigcup A_n = A_1 \cup (A_2 - A_1) \cup (A_3 - (A_1 \cup A_2)) \cup \cdots.$$
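
The rewriting of a union as a disjoint union can be sketched for a finite list of finite sets. This Python fragment is only an illustration; the function name `disjointify` and the sample sets are ours.

```python
def disjointify(sets):
    """Replace A_1, A_2, ... by A_1, A_2 - A_1, A_3 - (A_1 u A_2), ..."""
    seen = set()       # union of the sets handled so far
    pieces = []
    for A in sets:
        pieces.append(set(A) - seen)
        seen |= set(A)
    return pieces

sets = [{1, 2}, {2, 3}, {1, 3, 4}]
pieces = disjointify(sets)           # [{1, 2}, {3}, {4}]
union_of_pieces = set().union(*pieces)
union_of_sets = set().union(*sets)
```

The pieces are pairwise disjoint and have the same union as the original sets, which is exactly what the proof uses.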


We could also define the notion of an algebra of subsets of $X$. It is a
collection $\mathcal{A}$ of subsets satisfying the following conditions:

ALG 1. The empty set is in $\mathcal{A}$.
ALG 2. If $A, B \in \mathcal{A}$, then $A \cap B$, $A \cup B$, and $A - B$ are
in $\mathcal{A}$.

Thus we can say that a σ-algebra is an algebra which is closed under
taking countable unions, and containing the set $X$ itself.


Terminology. In some texts, what we call an algebra is called a ring 
(of subsets). However, in the theory of algebraic structures (groups, rings,
fields, vector spaces, etc.) it has become more or less standard practice
to assume that a ring has a unit element for multiplication, while an 
“algebra” is merely an additive group with a bilinear law of composition. 
Our definitions have therefore been made to fit these conventions, in the 
analogous situation of algebras of subsets. Here, of course, the “unit 
element” is the whole space. 


Let $\mathcal{S}$ be a collection of subsets of $X$. Then there exists a
smallest σ-algebra in $X$ which contains $\mathcal{S}$.

Proof. We can take for $\mathcal{M}$ the intersection of all σ-algebras
containing $\mathcal{S}$. The collection of all subsets of $X$ is such an
algebra, and does contain $\mathcal{S}$, so that we are not faced with the
empty set. It is immediate that the intersection $\mathcal{M}$ above is itself
a σ-algebra, so we are done.
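
When $X$ is finite, the generated σ-algebra can be computed by brute force. The Python sketch below does not follow the intersection argument of the proof; instead it closes the given collection under complements and unions until nothing new appears, which yields the same σ-algebra in the finite case. The function name and the example are ours.

```python
def generated_sigma_algebra(X, S):
    """Smallest family of subsets of the finite set X containing S,
    the empty set and X, and closed under complements and unions."""
    X = frozenset(X)
    family = {frozenset(), X} | {frozenset(s) for s in S}
    changed = True
    while changed:
        changed = False
        current = list(family)
        for A in current:
            candidates = [X - A] + [A | B for B in current]
            for C in candidates:
                if C not in family:
                    family.add(C)
                    changed = True
    return family

M = generated_sigma_algebra({1, 2, 3, 4}, [{1}, {1, 2}])
```

In this example the atoms are $\{1\}$, $\{2\}$, $\{3, 4\}$, so the generated σ-algebra has $2^3 = 8$ elements; the points 3 and 4 cannot be separated.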


In the preceding result, the σ-algebra $\mathcal{M}$ is said to be generated
by $\mathcal{S}$.


Example 1.

Let $X$ be a topological space, and let $\mathcal{S}$ be the collection of all
open sets. The σ-algebra generated by these open sets is called the algebra of
Borel sets. An element of this algebra is called Borel measurable. In
particular, every denumerable intersection of open sets and every
denumerable union of closed sets is Borel measurable.

Example 2.

Let $(X, \mathcal{M})$ be a measurable space. Let $f: X \to Y$ be a mapping of
$X$ into some set $Y$. Let $\mathcal{N}$ be the collection of subsets $S$ of $Y$
such that $f^{-1}(S)$ is measurable in $X$. Then $\mathcal{N}$ is a σ-algebra.
The proof for this is immediate from basic properties of inverse images of
sets. We call $\mathcal{N}$ the direct image of $\mathcal{M}$ under $f$, and
could denote it by $f_*(\mathcal{M})$. (Cf. Exercise 1.)

Example 3.

Let $X$ be a measurable space, and let $Y$ be a subset. If $\mathcal{M}$ is the
collection of measurable sets of $X$, we let $\mathcal{M}_Y$ consist of all
subsets $A \cap Y$, where $A \in \mathcal{M}$. Then it is clear that
$\mathcal{M}_Y$ is a σ-algebra, which is said to be induced by $\mathcal{M}$ on
$Y$. Then $(Y, \mathcal{M}_Y)$ is a measurable space.


Measurable Maps 


If $(X, \mathcal{M})$ and $(Y, \mathcal{N})$ are measurable spaces, and
$f: X \to Y$ is a map, we define $f$ to be measurable if for every
$B \in \mathcal{N}$ the set $f^{-1}(B)$ is in $\mathcal{M}$. By condition M2
below, one sees at once that if $Y$ is a topological space, and $\mathcal{N}$
is the σ-algebra of Borel sets, then $f$ is measurable in this general
sense if and only if it satisfies the seemingly weaker condition stated in
M2, namely that the inverse image of an open set is measurable. In
practice, we deal only with maps into topological spaces, and in fact into
normed vector spaces.

M1. If $f: X \to Y$ is measurable, and $g: Y \to Z$ is measurable, then the
composite $g \circ f$ is measurable. This is clear.

M2. Let $f: X \to Y$ be a map into a topological space, with the σ-algebra
of Borel sets. Suppose that for every open $V$ in $Y$, the
inverse image $f^{-1}(V)$ is measurable. Then $f$ is measurable.

Proof. Let $\mathcal{N}$ be the collection of subsets $S$ of $Y$ such that
$f^{-1}(S)$ is measurable in $X$. Then $\mathcal{N}$ is a σ-algebra and contains
the open sets. Hence it contains all Borel sets in $Y$, thus proving the
desired result.
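
The proof of M2 rests on the fact that inverse images commute with complements, unions, and intersections. The following Python sketch checks these identities on a small finite example; the map `f`, the sets, and the helper `preimage` are our choices for illustration.

```python
def preimage(f, X, S):
    """Inverse image of S under f, restricted to the domain X."""
    return frozenset(x for x in X if f(x) in S)

X = frozenset(range(6))
Y = frozenset({0, 1, 2})

def f(x):
    """A sample map X -> Y."""
    return x % 3

S = frozenset({0})
T = frozenset({0, 1})

complement_ok = preimage(f, X, Y - S) == X - preimage(f, X, S)
union_ok = preimage(f, X, S | T) == preimage(f, X, S) | preimage(f, X, T)
intersection_ok = preimage(f, X, S & T) == preimage(f, X, S) & preimage(f, X, T)
```

Because of these identities, the collection of sets whose inverse image is measurable is closed under the σ-algebra operations, which is the content of Example 2.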


From now on, our maps will have values in a topological space, with 
the Borel sets as measurable sets. 

We note at once that taking complements, we could have defined
measurability by the condition that the inverse image of a closed set is
measurable. Furthermore, we see that the inverse image of a countable
union of closed sets, and the inverse image of a countable intersection of
open sets is measurable because if $\{U_n\}$ is a sequence of open sets, then

$$f^{-1}\Big(\bigcap U_n\Big) = \bigcap f^{-1}(U_n)$$

and similarly for closed sets. Example: Let $J$ be a half-open interval
$(a, b]$ and let $f: X \to \mathbf{R}$ be measurable. Then

$$f^{-1}((a, b])$$

is measurable because we can write $(a, b]$ as the union of closed intervals

$$\Big[a + \frac{1}{n}, b\Big] \qquad \text{for } n = 1, 2, \ldots.$$


We shall now give a large number of criteria for mappings and sets to 
be measurable, and we shall see that limit operations preserve measur- 
ability, and algebraic operations likewise, under extremely mild hypo- 
theses on the image space Y. These hypotheses will always be satisfied in 
practice, and trivially so in the case when we deal with maps into the 
real or complex numbers, or into Euclidean n-space. 


M3. Let $f: X \to Y \times Z$ be a map of a measurable space $X$ into a
product of topological spaces $Y$, $Z$. Write $f$ in terms of its coordinate
maps, $f = (g, h)$ where $g: X \to Y$ and $h: X \to Z$. If $f$ is measurable,
then so are $g$ and $h$. Conversely, if $g$, $h$ are measurable,
and every open set in $Y \times Z$ is a countable union of open sets
$V \times W$, where $V$ is open in $Y$ and $W$ is open in $Z$, then $f$ is
measurable.

Proof. If $f$ is measurable, then composing $f$ with the projections of
$Y \times Z$ on $Y$ or $Z$ shows that both $g$ and $h$ are measurable.
Conversely, if $g$, $h$ are measurable, then for any open sets $V$, $W$ in $Y$,
$Z$ respectively, we have

$$f^{-1}(V \times W) = g^{-1}(V) \cap h^{-1}(W).$$

Hence $f^{-1}(V \times W)$ is measurable. The measurability of $f^{-1}(U)$ for
any open set $U$ now follows from the assumption made on the topology of
$Y \times Z$.


M4. In particular, we conclude that a complex function f on X is
measurable if and only if its real part and imaginary part are 
measurable. 


Note that the condition expressed on the product space $Y \times Z$ in our
criterion is satisfied if $Y$, $Z$ are metric spaces and have denumerable
everywhere dense sets. Thus it is satisfied if $Y$, $Z$ are separable
Banach spaces, and in particular for Euclidean n-space. Actually, in most
applications we integrate complex valued functions, so that there is no
problem with this extra condition.


M5. If $f$ is a measurable map of $X$ into a normed vector space, then
the absolute value $|f|$ is measurable, being composed of $f$ and the
continuous function $y \mapsto |y|$.


We would like the sum of two measurable maps $f$, $g$ into a normed
vector space $E$ to be measurable. Since the sum can be viewed as the
composite of the map $x \mapsto (f(x), g(x))$ and the sum map $E \times E \to E$,
which is continuous, what we want follows from our criterion concerning
maps into a product space, provided the extra condition is satisfied. In
particular, we obtain the following.


M6. Measurable complex valued functions on X form a vector space, 
and similarly if the values are in a finite dimensional space, or if 
we restrict ourselves to maps whose image is separable (i.e. contains 
a countable dense set). Similarly, if f, g are measurable complex 
functions on X, then the product fg is measurable. 


For this last assertion, we note that the product is composed of the
map $(f, g)$ and the product $\mathbf{C} \times \mathbf{C} \to \mathbf{C}$, which is continuous.


M7. Let $f: X \to Y$ be a mapping of $X$ into a metric space. Let $\{f_n\}$ be
a sequence of measurable mappings of $X$ into $Y$ which converges
pointwise to $f$. Then $f$ is measurable.


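Proof. Let $U$ be open in $Y$. If $x \in f^{-1}(U)$, then for all $k$ sufficiently
large, we must have $x \in f_k^{-1}(U)$ because $f_k(x)$ converges to $f(x)$. Hence
for each $m$,

\[ f^{-1}(U) \subset \bigcup_{k=m}^{\infty} f_k^{-1}(U), \]

and consequently

\[ f^{-1}(U) \subset \bigcap_{m=1}^{\infty} \bigcup_{k=m}^{\infty} f_k^{-1}(U). \]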


On the other hand, let $A$ be a closed set. Suppose that $x$ lies in the
union

\[ \bigcup_{k=m}^{\infty} f_k^{-1}(A) \]

for all positive integers $m$. Then for arbitrarily large $k$, we see that $f_k(x)$
lies in $A$, and hence by assumption the limit $f(x)$ lies in $A$ because $A$ is
closed. Hence we obtain the reverse inclusion

\[ \bigcap_{m=1}^{\infty} \bigcup_{k=m}^{\infty} f_k^{-1}(A) \subset f^{-1}(A). \]


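Let $V$ be a fixed open set, and let $\complement V$ denote its complement. For each
positive integer $n$ let $A_n$ be the closed set of all $y \in Y$ such that
$d(y, \complement V) \geq 1/n$, and let $V_n$ be the open set of all $y \in Y$ such that
$d(y, \complement V) > 1/n$. Then

\[ V_n \subset A_n \subset V \]

and

\[ V = \bigcup_{n=1}^{\infty} V_n = \bigcup_{n=1}^{\infty} A_n. \]

Thus we have the inclusions

\[ f^{-1}(V) = \bigcup_{n} f^{-1}(A_n) \supset \bigcup_{n} \bigcap_{m} \bigcup_{k \geq m} f_k^{-1}(A_n) \]

and

\[ f^{-1}(V) = \bigcup_{n} f^{-1}(V_n) \subset \bigcup_{n} \bigcap_{m} \bigcup_{k \geq m} f_k^{-1}(V_n) \subset \bigcup_{n} \bigcap_{m} \bigcup_{k \geq m} f_k^{-1}(A_n). \]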


This proves that the equality holds, and shows that $f^{-1}(V)$ is measurable.

This last result is really the main thing we were after. We need it
immediately in the next section to know that if $f$ is a limit of measurable
real valued functions, then for every real $a$, the set

\[ f^{-1}(J) \]

is measurable when $J$ is equal to the interval of all $t > a$ or the interval
of all $t \geq a$.

In the definition and development of the first properties of the integral 
in the subsequent sections, the limit property we have just proved, com- 
bined with our definition, is the one which will be most useful. It turns 
out that there is a condition which is necessary and sufficient for a map 
to be measurable in all applications, but which we preferred to postpone 
and state as a criterion rather than take as definition. We now discuss 
this condition. It will be the useful one in dealing with further properties. 

A map $f: X \to Z$ into any set $Z$ is said to be a simple map if it takes
on only a finite number of values, and if, for each $v \in Z$, the inverse image
$f^{-1}(v)$ is measurable. Thus $X$ can be written as a finite disjoint union,

\[ X = \bigcup_{i=1}^{n} X_i, \]

where each $X_i$ is measurable, and $f$ is constant on $X_i$.

It is clear that simple maps of X into a Banach space E form them- 
selves a vector space. 

If $\{\varphi_n\}$ is a sequence of simple maps of $X$ into a Banach space $E$, and
$\{\varphi_n\}$ converges pointwise, then the limit is measurable, according to the
criterion M7. The converse is almost true, and is indeed true when $E$ is
finite dimensional (so in particular when $E$ represents the real or complex
numbers). We have:


M8. A map $f: X \to E$ of $X$ into a finite dimensional space is measurable
if and only if it is a pointwise limit of simple maps.


Proof. The result reduces immediately to the case when $E = \mathbf{R}$. We
leave the reduction to the reader. Thus assume that $f$ is measurable real
valued. For each integer $n \geq 1$ cut up the interval $[-n, n]$ into intervals
of equal length $1/n$ and denote these intervals by $J_1, \ldots, J_N$. We take
each $J_k$ to be closed on the left and open on the right. We let $J_{N+1}$
consist of all $t$ such that $|t| \geq n$. Let

\[ A_k = f^{-1}(J_k) \quad \text{for } k = 1, \ldots, N+1, \]

so that each $A_k$ is measurable, the sets $A_k$ ($k = 1, \ldots, N+1$) are disjoint,
and their union is $X$. On each $A_k$ we define a constant map $\psi_n$ by

\[ \psi_n(A_k) = \inf_{A_k} f \quad \text{if } k = 1, \ldots, N. \]

We can write $A_{N+1} = B \cup B'$ where $B$ consists of those elements $x$ such
that $f(x) \geq n$ and $B'$ consists of those $x$ such that $f(x) \leq -n$. We define

\[ \psi_n(B) = n \quad \text{and} \quad \psi_n(B') = -n. \]

Then the sequence $\{\psi_n\}$ converges pointwise to $f$, and each $\psi_n$ is a simple
function. This proves that measurability implies the other condition. The
converse is already known from M7, and thus our characterization of
measurable maps is proved.

The construction of the case we just discussed yields a useful addi- 
tional property in the positive case: 


M9. Let $f: X \to \mathbf{R}_{\geq 0}$ be a positive real valued measurable map. Then
$f$ is a pointwise limit of an increasing sequence of simple maps.


Proof. The functions $\psi_n$ defined above are all $\leq f$, and we let

\[ \varphi_n = \max(\psi_1, \ldots, \psi_n). \]

Then $\{\varphi_n\}$ is increasing to $f$, as desired.
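
The constructions in M8 and M9 can be tried out numerically. The sketch below is only an illustration under a stated assumption: instead of the infimum of f over each set of the partition, it uses the left endpoint of the interval of length 1/n containing the value t = f(x), so that the approximation can be computed from t alone. The names `psi` and `phi` are ours, and exact rational arithmetic is used to avoid rounding effects.

```python
from fractions import Fraction
import math


def psi(n, t):
    """Value of the n-th simple approximation at a point where f equals t.

    Assumption of this sketch: inside [-n, n) we take the left endpoint of
    the interval of length 1/n containing t; outside we truncate to n or -n.
    """
    t = Fraction(t)
    if t >= n:
        return Fraction(n)
    if t <= -n:
        return Fraction(-n)
    return Fraction(math.floor(t * n), n)


def phi(n, t):
    """Increasing approximation as in M9: the maximum of psi(1, t), ..., psi(n, t)."""
    return max(psi(k, t) for k in range(1, n + 1))
```

For a fixed value t, `psi(n, t)` differs from t by less than 1/n as soon as n exceeds |t|, and for t ≥ 0 the values `phi(n, t)` increase with n and stay below t.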


After discussing positive measures, we shall discuss a variant of condi- 
tion M8, related to a given measure. 


Positive Measures 


We shall now define positive measures. To do this, it is convenient to
introduce the symbol $\infty$ in the context of positivity (after all, we want
some sets to have infinite measure).

We let $\infty$ be a symbol unequal to any real number. By $[0, \infty]$ (which
we call also an interval) we mean all $t$ which are real $\geq 0$ or $\infty$. We
introduce the obvious ordering in $[0, \infty]$, with $a < \infty$ for every real $a$.
We define addition and multiplication in $[0, \infty]$ by the convention that

\[
\begin{aligned}
\infty \cdot a &= a \cdot \infty = 0 && \text{if } a = 0, \\
\infty \cdot a &= a \cdot \infty = \infty && \text{if } 0 < a \leq \infty, \\
\infty + a &= a + \infty = \infty && \text{if } 0 \leq a \leq \infty.
\end{aligned}
\]

Then associativity, distributivity, and commutativity hold in $[0, \infty]$. The
sum of a sequence of elements in $[0, \infty]$ then can be viewed to converge
to a number $\geq 0$ or to $\infty$.

Let $X$ be a measurable space and let $\mathcal{M}$ be the collection of its
measurable sets. A positive measure on $\mathcal{M}$ (or on $X$, by abuse of
language) is a map

\[ \mu: \mathcal{M} \to [0, \infty] \]

which is countably additive. In other words $\mu(\emptyset) = 0$, and if $\{A_n\}$ is
a sequence of measurable sets which are mutually disjoint ($A_n \cap A_m$ is
empty if $n \neq m$), then

\[ \mu\Big(\bigcup_{n=1}^{\infty} A_n\Big) = \sum_{n=1}^{\infty} \mu(A_n). \]

If $A$ is measurable, we call $\mu(A)$ its measure, or $\mu$-measure if the reference
to $\mu$ is necessary to avoid confusion.


Examples. Let $X$ be a set and $x_0$ an element of $X$. If $A$ is a subset of
$X$ containing $x_0$, we define $\mu(A) = 1$. If $A$ does not contain $x_0$ we define
$\mu(A) = 0$. It is immediately verified that this defines a measure, called the
Dirac measure at $x_0$.

As another example, if a subset is finite, we define its measure to be its 
number of elements, and if a subset is infinite, we define its measure to 
be oo. Again it is immediately verified that this defines a measure, called 
the counting measure. 
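
As a small sanity check, the two examples can be modelled on finite sets. The following sketch is ours and is not part of the formal development: subsets are represented by Python sets, the helper names `dirac`, `counting` and `is_additive` are hypothetical, and only finite families of finite sets are tested, so the value ∞ never occurs.

```python
def dirac(x0):
    """Return the Dirac measure at x0 as a function on Python sets."""
    return lambda A: 1 if x0 in A else 0


def counting(A):
    """Counting measure of a finite set: its number of elements."""
    return len(A)


def is_additive(mu, pieces):
    """Check mu(union of pieces) == sum of mu(piece) for pairwise disjoint finite sets."""
    union = set().union(*pieces)
    # The pieces are disjoint exactly when no element is counted twice.
    assert sum(len(p) for p in pieces) == len(union), "pieces must be disjoint"
    return mu(union) == sum(mu(p) for p in pieces)
```

For instance, with the disjoint pieces `{1, 2}`, `{3}`, `{4, 5, 6}`, both `counting` and `dirac(3)` pass the additivity check.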


We shall identify measures with integrals later. 


A measurable space together with a measure is called a measured
space. When we want to specify all data in the notation, we write the
full triple $(X, \mathcal{M}, \mu)$ for a measured space.

We derive some trivial consequences from the definition of a positive 
measure. 

First we note that the additivity of $\mu$ holds for finite sequences since
we can take all but a finite number of the $A_n$ to be empty.

Next, a measure satisfies properties of monotonicity, namely:


If $A$, $B$ are measurable, $A \subset B$, then $\mu(A) \leq \mu(B)$.

This is obvious because we can write $B = A \cup (B - A)$.


Proposition 1.1. If $\{A_n\}$ is a sequence of measurable sets and $A_n \subset A_{n+1}$
for all $n$, and if

\[ A = \bigcup_{n=1}^{\infty} A_n, \]

then

\[ \mu(A) = \lim_{n \to \infty} \mu(A_n). \]

(This is understood in the obvious sense if $\mu(A) = \infty$.) To prove this, we
let $A_0$ be the empty set, write

\[ A = A_1 \cup (A_2 - A_1) \cup (A_3 - A_2) \cup \cdots \cup (A_{n+1} - A_n) \cup \cdots, \]

and use the countable additivity. We get

\[ \mu(A) = \lim_{N \to \infty} \sum_{n=0}^{N-1} \mu(A_{n+1} - A_n) = \lim_{N \to \infty} \mu(A_N), \]

as was to be shown.


It will occasionally be useful to have the following characterization of 
measures: 


Proposition 1.2. A map $\mu: \mathcal{M} \to [0, \infty]$ is a measure if and only if
$\mu(\emptyset) = 0$, $\mu$ is finitely additive, and if $\{A_n\}$ is an increasing sequence of
measurable sets whose union is $A$, then

\[ \lim_{n \to \infty} \mu(A_n) = \mu(A). \]

Our assertion is obvious, taking into account our preceding arguments.


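Proposition 1.3. If $\{A_n\}$ is a decreasing sequence of measurable sets, i.e.
$A_{n+1} \subset A_n$ for all $n$, if some $A_n$ has finite measure, and if

\[ A = \bigcap_{n=1}^{\infty} A_n, \]

then

\[ \mu(A) = \lim_{n \to \infty} \mu(A_n). \]

To prove this, say $\mu(A_1) \neq \infty$. We write

\[ A_1 = (A_1 - A_n) \cup A_n. \]

The sets $A_1 - A_n$ form an ascending sequence, whose union is $A_1 - A$.
By our previous result, we conclude that

\[
\begin{aligned}
\mu(A_1) &= \lim_{n \to \infty} \mu(A_1 - A_n) + \lim_{n \to \infty} \mu(A_n) \\
&= \mu(A_1 - A) + \lim_{n \to \infty} \mu(A_n) \\
&= \mu(A_1) - \mu(A) + \lim_{n \to \infty} \mu(A_n).
\end{aligned}
\]

Our assertion follows.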

Note that if we do not assume that some $A_n$ has finite measure, then
the conclusion may be false. Indeed if all $A_n$ have infinite measure, their
intersection may be empty. Think of the real numbers $\geq n$.

If $\{A_n\}$ is an arbitrary sequence of measurable sets, then in general we
have only

\[ \mu\Big(\bigcup_{n=1}^{\infty} A_n\Big) \leq \sum_{n=1}^{\infty} \mu(A_n). \]

This is again obvious.

Having the notion of (positive) measure on $\mathcal{M}$ we emphasize the role
played by sets of measure 0, and we shall use the following terminology.
A property of elements of $X$ is said to hold almost everywhere, or for
almost all $x$, if there exists a set $S$ of measure 0 such that the property
holds for all $x \notin S$. For instance, if $f: X \to \mathbf{R}$ is a map of $X$ into the
reals, we say that $f \geq 0$ almost everywhere if $f(x) \geq 0$ for almost all $x$,
i.e. for all $x$ outside a set of measure 0. Of course, we should really put
the $\mu$ into the notation, and say $\mu$-almost everywhere or $\mu$-almost all, but
since we deal with a fixed measure, we omit the prefix $\mu$- for simplicity.

In developing the theory of the integral, we follow the oldest idea, 
which is first to integrate step maps and then take limits. We shall now 
discuss the measure theoretic aspect of this procedure. 

Let $A$ be a set of finite measure. By a partition of $A$ we mean a finite
sequence $\{A_i\}$ ($i = 1, \ldots, r$) of measurable sets which are disjoint and such
that

\[ A = \bigcup_{i=1}^{r} A_i. \]


Let $E$ be a Banach space. A map $f: X \to E$ is called a step map with
respect to such a partition if $f$ is equal to 0 outside $A$ (that is $f(x) = 0$ if
$x \notin A$), and $f(A_i)$ has one element for each $i$ (i.e. $f$ is constant on $A_i$). A
map $f: X \to E$ is said to be a step map if it is step with respect to some
partition of some set of finite measure. We denote the set of all step
maps by $\mathrm{St}(\mu, E)$ or more briefly by $\mathrm{St}(\mu)$.

If $Y$ is a measurable subset of $X$, then the restriction to $Y$ of a step
map on $X$ is a step map on $Y$. Conversely, a step map on $Y$ can be
extended to a step map on $X$ by giving it value 0 outside $Y$. If $f$ is a
map on $X$, we denote by $f_Y$ the map such that $f_Y(x) = 0$ if $x \notin Y$ and
$f_Y(x) = f(x)$ if $x \in Y$.

The set of step maps $\mathrm{St}(\mu, E)$ is a vector space. If $f$ is a step map, then
so is $|f|$. If $f: X \to E$ is a step map and $g: X \to \mathbf{C}$ is a step function,
then $gf$ (also written $fg$) is a step map.


Proof. This is proved trivially using a refinement of two partitions.
Indeed, if $\{A_i\}$ and $\{B_j\}$ are two partitions of $A$, then

\[ \{A_i \cap B_j\} \]

is also a partition. Also, if $f$ is 0 outside $A$, and $g$ is 0 outside $B$, and $A$,
$B$ are measurable of finite measure, then $A \cup B$ has finite measure, and
we can find a partition of $A \cup B$ with respect to which both $f$ and $g$ are
step maps. From this our assertions are obvious.


We shall not use the rest of this section until the corollaries of the 
dominated convergence theorem in §5. 


We shall define the integral on certain maps which are limits of step
maps. The present discussion is devoted to such limits. We define a map $f$
to be $\mu$-measurable if it is a pointwise limit of a sequence of step maps
almost everywhere, in other words, if there exists a set $Z$ of measure 0
and a sequence of step maps $\{\varphi_n\}$ such that $\{\varphi_n(x)\}$ converges to $f(x)$ for
all $x \notin Z$. Let $f: X \to Y$ be $\mu$-measurable, and let $A \subset X$ and $B \subset Y$ be
measurable subsets with $f(A) \subset B$. Then the induced map $f: A \to B$ is
$\mu$-measurable. Instead of M1, we have: if $f: X \to E$ is $\mu$-measurable, and
$g: E \to F$ is continuous, then $g \circ f$ is $\mu$-measurable.


M10. The $\mu$-measurable maps of $X$ into $E$ form a vector space. If $f$, $g$
are $\mu$-measurable functions (complex), so is their product. In fact,
if $f: X \to E$ and $g: X \to F$ are $\mu$-measurable maps into Banach
spaces, and $E \times F \to G$ is a continuous bilinear map, then the
product $fg$ (with respect to this map) is $\mu$-measurable. The absolute
value $|f|$ is $\mu$-measurable. If $f$ is a $\mu$-measurable function
such that $f(x) \neq 0$ for all $x$, then $1/f$ is $\mu$-measurable.


Proof. All statements are clear, except possibly the last, for which we
give the argument: If $\{\varphi_n\}$ is a sequence of step functions converging
pointwise to $f$, then we let $\psi_n(x) = 1/\varphi_n(x)$ if $\varphi_n(x) \neq 0$ and $\psi_n(x) = 0$ if
$\varphi_n(x) = 0$. Then $\psi_n$ is step, and the sequence $\{\psi_n\}$ converges pointwise to
$1/f$.


The property of $\mu$-measurability builds in some very strong finiteness
properties on both the set of departure and the set of arrival of the map.
To begin with, it is clear that a $\mu$-measurable map vanishes outside a
countable union of sets of finite measure. Such sets are important. We
give a name to them, and say that a measurable subset $Y$ of $X$ is $\sigma$-finite
if it is a countable union of sets of finite measure. More accurately, we
should really say that $\mu$ is $\sigma$-finite on $Y$, and we should say that $\mu$ is
$\sigma$-finite if it is $\sigma$-finite on $X$. However, we allow ourselves the other
terminology when $\mu$ is fixed throughout a discussion.

Secondly, there exists a set $Z$ of measure 0 such that the image
$f(X - Z)$ of the complement of $Z$ contains a countable dense set (i.e. is
separable). This is clear since outside such $Z$ the map $f$ is a pointwise
limit of step maps, and thus the image of $X - Z$ lies in the closure of a
set which is a countable union of finite sets. Thus we now have two
necessary conditions for a measurable map to be $\mu$-measurable, namely
countability conditions on its domain and range. It turns out that these
are sufficient.


M11. Let $f: X \to E$ be a map of $X$ into a Banach space. The following
two conditions are equivalent:

(i) There exists a set $Z$ of measure 0 such that the restriction of
$f$ to the complement of $Z$ is measurable, $f$ vanishes outside
a $\sigma$-finite subset of $X$, and the image $f(X - Z)$ contains a
countable dense set.

(ii) The map $f$ is a pointwise limit almost everywhere of a sequence
of step maps (that is, $f$ is $\mu$-measurable).

In particular, if $\mu$ is $\sigma$-finite and if $f$ is a function (complex
valued), then $f$ is $\mu$-measurable if and only if there exists a subset
$Z$ of measure 0 such that $f$ is measurable on the complement of $Z$.


Proof. We have already proved that (ii) implies (i), using our preceding
remarks, and M7. Conversely, assume (i). We may assume that $X$ is
a disjoint union of subsets $X_k$ ($k = 1, 2, \ldots$) of finite measure. If we can
prove that the restriction $f|X_k$ of $f$ to each $X_k$ is $\mu$-measurable, then for
each $k$ there is a sequence $\{\varphi_j^{(k)}\}$ ($j = 1, 2, \ldots$) of step maps on $X_k$ which
converges almost everywhere to $f|X_k$. We define $\varphi_n$ by the following
values:

\[ \varphi_n \text{ is } \varphi_n^{(k)} \text{ on } X_k \text{ for } k = 1, \ldots, n, \]

\[ \varphi_n(x) = 0 \quad \text{if } x \notin X_1 \cup \cdots \cup X_n. \]

Then each $\varphi_n$ is a step map, and the sequence $\{\varphi_n\}$ converges almost
everywhere to $f$. This reduces the proof that $f$ is $\mu$-measurable to the
case when $X$ has finite measure.

Suppose therefore that $X$ has finite measure. We may also assume
that the image of $f$ contains a countable dense set $\{v_k\}$ ($k = 1, 2, \ldots$).
For each positive integer $n$, let $B_{1/n}(v_k)$ be the open ball of radius $1/n$
centered at $v_k$. The union of these balls for all $k = 1, 2, \ldots$ covers the
image of $f$, whence the union of the inverse images under $f$ covers $X$
itself. If we take $k$ large, it follows that the finite union of inverse images

\[ f^{-1}(B_{1/n}(v_1)) \cup \cdots \cup f^{-1}(B_{1/n}(v_k)) = X - Y_n \]

differs from $X$ by a set $Y_n$ such that $\mu(Y_n) < 1/2^n$. We let

\[ Z_n = Y_n \cup Y_{n+1} \cup \cdots, \]

so that $\mu(Z_n) \leq 1/2^{n-1}$. Then $Z_n \supset Z_{n+1} \supset \cdots$ is a decreasing sequence.
On $X - Y_n$ we can obviously find a step map $\varphi_n$ such that

\[ |f(x) - \varphi_n(x)| < 1/n \quad \text{for } x \notin Y_n. \]

We simply define the map $\varphi_n$ inductively to have the value $v_1$ on the
inverse image of $B_{1/n}(v_1)$, the value $v_2$ on the inverse image of

\[ B_{1/n}(v_2) - B_{1/n}(v_1), \]

and so forth. We let $\psi_n$ be equal to $\varphi_n$ on $X - Z_n$ and give $\psi_n$ the value
0 on $Z_n$. Then $\psi_n$ is a step map, and the sequence $\{\psi_n\}$ converges
pointwise to $f$, except possibly on the set $Z$ equal to the intersection of
all $Z_n$, which has measure 0. This proves what we wanted.
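
The inductive choice of values in the proof amounts to a "first ball that contains f(x)" rule. The following sketch illustrates that rule for real values only. It is not the construction itself: the function name `step_value` and the finite list `centers` (standing in for the first few points of the dense set) are assumptions of ours.

```python
def step_value(y, centers, n):
    """Return the first center v with |y - v| < 1/n, or None if there is none.

    Here y plays the role of f(x). Taking the first matching center mirrors
    giving the map the value v_1 on the inverse image of the first ball, the
    value v_2 on the inverse image of the second ball minus the first, and
    so forth. The value None marks a point of the exceptional set.
    """
    for v in centers:
        if abs(y - v) < 1.0 / n:
            return v
    return None
```

With `centers = [k / 10 for k in range(-50, 51)]` and `n = 5`, every value y in the interval [-5, 5] receives a center at distance less than 1/5, while a value such as 100.0 falls in no ball.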


Remark 1. The proof is substantially the same as that of M8, granting 
the necessary adjustment to the more general situation. 


Remark 2. We get some uniformity of convergence from the proof, 
outside a set of arbitrarily small measure. 


Remark 3. We took values of f in a Banach space, but for purposes 
of M11, values in any complete metric space would have done just as 
well. The additive structure plays no role. However, in all subsequent 
applications, we deal with maps in vector spaces where the additive 
structure does play a role. 


Remark 4. Let $\mathcal{M}$ be the $\sigma$-algebra of all subsets of the set $X$. Let
$f: X \to E$ be an arbitrary map into a Banach space. Then $f$ is measurable,
and $\mu$-measurable if $\mu$ is such that $\mu(Y) = 0$ for all subsets $Y$ of $X$.
This shows that it is reasonable to exclude the behavior on a set of
measure 0 in our definition of $\mu$-measurability.


M12. Let $\{f_n\}$ be a sequence of $\mu$-measurable maps, converging almost
everywhere to a map $f$. Then $f$ is $\mu$-measurable.


Proof. This is clear by using (i) of M11, and the following facts: A
denumerable union of sets of measure 0 has measure 0. A denumerable
union of sets having countable dense subsets has a countable dense
subset. [If $\{D_n\}$ is a sequence of denumerable sets in a metric space, then

\[ \overline{\bigcup_{k=1}^{\infty} D_k} \supset \overline{D_n} \quad \text{for all } n, \quad \text{whence} \quad \overline{\bigcup_{k=1}^{\infty} D_k} \supset \bigcup_{n=1}^{\infty} \overline{D_n}, \]

so that

\[ \overline{\bigcup_{k=1}^{\infty} D_k} = \overline{\bigcup_{n=1}^{\infty} \overline{D_n}}, \]

and our second statement is clear also.]

Property M12 concludes the list of properties which show that
$\mu$-measurability is preserved under the standard operations of analysis, with
the sole exception of composition of maps, contrary to M1.


For the rest of this chapter, we let $(X, \mathcal{M}, \mu)$ be a measured space, i.e.
$\mathcal{M}$ is a $\sigma$-algebra in $X$, and $\mu$ is a positive measure on $\mathcal{M}$. We let $E$ be a
Banach space. At first reading, the reader may assume that all maps f are 
complex or real valued, that is E=C or R. No proof or notation would 
be made shorter by this assumption. 


VI, §2. THE INTEGRAL OF STEP MAPS


If $A$ is a measurable set of finite measure, and $f$ is a step map with
respect to a partition $\{A_i\}$ ($i = 1, \ldots, r$) of $A$, then we define its integral to
be

\[ \int_X f \, d\mu = \sum_{i=1}^{r} \mu(A_i) f(A_i). \]

If $\{B_j\}$ ($j = 1, \ldots, s$) is another partition of $A$, then $f$ is step with respect
to the partition $\{A_i \cap B_j\}$ and we have

\[ \sum_{j=1}^{s} \mu(A_i \cap B_j) f(A_i) = \mu(A_i) f(A_i). \]

Summing over $i$ shows that our integral does not depend on the partition
of $A$. If $f$ is step with respect to a partition of a set $A$ and a set $B$, then
it is also step with respect to a partition of $A \cup B$, and we see that our
integral is therefore well defined.
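
The independence of the partition can be checked by hand on a finite measured space. The sketch below is only an illustration under assumptions of ours: the space is a finite set whose measure is given by a dictionary `weights` of point masses, a step map is a list of pairs (set of the partition, constant value), and the helper names are hypothetical.

```python
def measure(weights, A):
    """Measure of a subset A of a finite set, given the weight of each point."""
    return sum(weights[x] for x in A)


def integral(weights, step):
    """Integral of a step map given as a list of (A_i, value on A_i) pairs."""
    return sum(measure(weights, A) * v for A, v in step)


def refine(step, blocks):
    """Rewrite the step map on the partition of intersections, dropping empty ones."""
    return [(A & B, v) for A, v in step for B in blocks if A & B]
```

With `weights = {0: 1, 1: 2, 2: 3, 3: 4}` and the step map `[({0, 1}, 5), ({2, 3}, -1)]`, the integral is 3·5 + 7·(−1) = 8, and refining by the second partition `[{0, 2}, {1, 3}]` gives the same value.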

If $A$ is an arbitrary measurable subset and $f$ is a step map on $X$,
recall that $f_A$ is the map such that $f_A(x) = f(x)$ if $x \in A$ and $f_A(x) = 0$ if
$x \notin A$. Then $f_A$ is a step map both on $A$ and on $X$, and we define

\[ \int_A f \, d\mu = \int_X f_A \, d\mu. \]


If $\mu$ remains fixed throughout a discussion, we write

\[ \int_X f \quad \text{instead of} \quad \int_X f \, d\mu, \]

and even omit the $X$ if the total space $X$ is fixed, so that we also write

\[ \int f \quad \text{instead of} \quad \int_X f. \]


If we integrate over a subset of X, then we shall always specify this 
subset, however. We now have trivial properties of the integral. 
First, the integral is obviously a linear map

\[ \int: \mathrm{St}(\mu, E) \to E \]

which satisfies the following properties.


If $A$, $B$ are disjoint, then

\[ \int_{A \cup B} f = \int_A f + \int_B f. \tag{1} \]

This is clear from the linearity, and the fact that $f_{A \cup B} = f_A + f_B$.


Over the reals, the integral is an increasing function of its variables.
This means: If $E = \mathbf{R}$ and $f \leq g$, then

\[ \int_X f \leq \int_X g. \tag{2} \]

Furthermore, if $f \geq 0$ and $A \subset B$, then

\[ \int_A f \leq \int_B f. \tag{3} \]

Property 2 can be obtained from its positive alternate, namely

(2P) If $f \geq 0$, then $\int_X f \geq 0$.

Indeed, we just use linearity on $g - f$.


Finally, the integral satisfies the inequalities

\[ \left| \int_A f \, d\mu \right| \leq \int_A |f| \, d\mu \leq \|f\| \, \mu(A), \tag{4} \]

where $\| \ \|$ is the sup norm. This is an obvious estimate on a finite sum
expressing the integral.

We can define a seminorm on the space of step maps, by letting

\[ \|f\|_1 = \int_X |f| \, d\mu = \int_X |f|. \]

That this is a seminorm is immediately verified. For instance, to show
that

\[ \|f + g\|_1 \leq \|f\|_1 + \|g\|_1, \]

we take a partition of a set of finite measure such that both $f$ and $g$ are
step maps with respect to this partition, and then we estimate using the
triangle inequality. This seminorm will be called the $L^1$-seminorm.


Note. The results of this section are at the level of a first course in
calculus. We don't take limits, and our results depend only on the
presence of an algebra (not necessarily a $\sigma$-algebra) and a map $\mu$ of this
algebra into the reals $\geq 0$ which is additive, i.e.

\[ \mu(A \cup B) = \mu(A) + \mu(B) \]

for $A$, $B$ disjoint in the algebra.


VI, §3. THE $L^1$-COMPLETION


We wish to investigate the completion of our space of step maps with
respect to the $L^1$-seminorm. We recall that the completion is defined to
be the space of equivalence classes of $L^1$-Cauchy sequences, and that two
Cauchy sequences are said to be equivalent if their difference is an
$L^1$-null sequence. We denote the completion by $L^1(\mu)$. We recall that the
$L^1$-seminorm extends by continuity to a norm on this completion. We
have a linear map

\[ \mathrm{St}(\mu) \to L^1(\mu) \]

whose kernel is the subspace of step maps whose $L^1$-norm is 0. We shall
describe this kernel in a more general situation later.

We want to determine a certain space of functions corresponding as
closely as possible to the elements of $L^1(\mu)$. If every $L^1$-Cauchy sequence
were also pointwise convergent, there would be no problem. This is
however not the case, but the situation is close enough to this so that we
can almost think in these terms.
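
A standard illustration of this phenomenon, not taken from the text, is the sequence of indicator functions of shrinking dyadic intervals sweeping across [0, 1). The sketch below assumes the usual length measure on [0, 1), which has not yet been constructed at this point of the development, and the names `interval`, `f` and `l1_norm` are ours. The $L^1$-seminorms tend to 0, so the sequence is $L^1$-Cauchy, yet at every point the values 0 and 1 both recur infinitely often. The subsequence indexed by powers of 2 does converge to 0 at every point other than 0.

```python
from fractions import Fraction


def interval(n):
    """Support [a, b) of the n-th indicator, where n = 2**k + j with 0 <= j < 2**k."""
    k = n.bit_length() - 1
    j = n - 2**k
    return Fraction(j, 2**k), Fraction(j + 1, 2**k)


def f(n, x):
    """Value at x of the indicator function of the n-th interval (n >= 1)."""
    a, b = interval(n)
    return 1 if a <= x < b else 0


def l1_norm(n):
    """L^1-seminorm of the n-th indicator: the length of its support."""
    a, b = interval(n)
    return b - a
```

For x = 1/3, each block of indices from 2^k to 2^(k+1) − 1 contains exactly one n with `f(n, x) == 1`, so the full sequence does not converge at x, whereas `f(2**k, x)` is 0 for all k ≥ 2.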

We define $\mathcal{L}^1(\mu)$ to be the set of mappings $f$ such that there exists an
$L^1$-Cauchy sequence of step mappings converging almost everywhere to
$f$. If $\{f_n\}$ and $\{g_n\}$ are $L^1$-Cauchy sequences of step mappings converging
almost everywhere to $f$ and $g$ respectively, then $\{f_n + g_n\}$ and $\{\alpha f_n\}$
(for any number $\alpha$) are $L^1$-Cauchy and converge almost everywhere to
$f + g$ and $\alpha f$ respectively. Consequently $\mathcal{L}^1(\mu)$ is a vector space.

In this section and the next, we speak of Cauchy sequences instead of
$L^1$-Cauchy sequences since this is the only seminorm which will enter
into considerations. Since we have several notions of convergence, however,
we still specify by an adjective the type of convergence meant in
each case. Actually, it will be useful to say that a sequence $\{f_n\}$
approximates an element $f$ of $\mathcal{L}^1$ if $\{f_n\}$ is $L^1$-Cauchy and converges to $f$
almost everywhere.

We shall extend the integral to $\mathcal{L}^1$, and we need two lemmas, which
show that our approximation technique is not far removed from uniform
approximation. The first is the fundamental lemma of integration.


Lemma 3.1. Let $\{f_n\}$ be a Cauchy sequence of step mappings. Then
there exists a subsequence which converges pointwise almost everywhere,
and satisfies the additional property: given $\varepsilon$ there exists a set $Z$ of
measure $< \varepsilon$ such that this subsequence converges absolutely and
uniformly outside $Z$.


Proof. For each integer $k$ there exists $N_k$ such that if $m, n \geq N_k$, then

\[ \|f_m - f_n\|_1 < \frac{1}{2^{2k}}. \]

We let our subsequence be $g_k = f_{N_k}$, taking the $N_k$ inductively to be
strictly increasing. Then we have for all $m$, $n$:

\[ \|g_m - g_n\|_1 < \frac{1}{2^{2n}} \quad \text{if } m \geq n. \]

We shall show that the series

\[ g_1(x) + \sum_{k=1}^{\infty} \big(g_{k+1}(x) - g_k(x)\big) \]

converges absolutely for almost all $x$ to an element of $E$, and in fact we
shall prove that this convergence is uniform except on a set of arbitrarily
small measure.

Let $Y_n$ be the set of $x \in X$ such that

\[ |g_{n+1}(x) - g_n(x)| \geq \frac{1}{2^n}. \]

Since $g_n$ and $g_{n+1}$ are step mappings, it follows that $Y_n$ has finite measure.
On $Y_n$ we have the inequality

\[ \frac{1}{2^n} \leq |g_{n+1} - g_n|, \]

whence

\[ \frac{1}{2^n} \mu(Y_n) = \int_{Y_n} \frac{1}{2^n} \leq \int_X |g_{n+1} - g_n| \leq \frac{1}{2^{2n}}. \]

Hence

\[ \mu(Y_n) \leq \frac{1}{2^n}. \]

Let

\[ Z_n = Y_n \cup Y_{n+1} \cup \cdots. \]

Then

\[ \mu(Z_n) \leq \frac{1}{2^{n-1}}. \]

If $x \notin Z_n$, then for $k \geq n$ we have

\[ |g_{k+1}(x) - g_k(x)| < \frac{1}{2^k}, \]

and from this we conclude that our series

\[ \sum_{k=n}^{\infty} \big(g_{k+1}(x) - g_k(x)\big) \]

is absolutely and uniformly convergent, for $x \notin Z_n$. This proves the statement
concerning the uniform convergence. If we let $Z$ be the intersection
of all $Z_n$, then $Z$ has measure 0, and if $x \notin Z$, then $x \notin Z_n$ for some $n$,
whence our series converges for this $x$. This proves the lemma.


Lemma 3.2. Let $\{g_n\}$ and $\{h_n\}$ be Cauchy sequences of step mappings
of $X$ into $E$, converging almost everywhere to the same map. Then the
following limits exist and are equal:

\[ \lim_{n \to \infty} \int_X g_n = \lim_{n \to \infty} \int_X h_n. \]

Furthermore, the Cauchy sequences $\{g_n\}$ and $\{h_n\}$ are equivalent, i.e.
$\{g_n - h_n\}$ is an $L^1$-null sequence.


Proof. The existence of the limit of each integral is of course a triviality.
To see the argument once more, we have

\[ \left| \int_X g_n - \int_X g_m \right| \leq \int_X |g_n - g_m| = \|g_n - g_m\|_1, \]

so that $\{\int_X g_n\}$ is a Cauchy sequence, whence converges. Let $f_n = g_n - h_n$.
Then $\{f_n\}$ is Cauchy, converges almost everywhere to 0, and we must
prove that the integrals

\[ \int_X f_n \quad \text{and} \quad \int_X |f_n| \]

converge to 0.

Given $\varepsilon$, there exists $N$ such that if $m, n \geq N$ we have

\[ \|f_n - f_m\|_1 < \varepsilon. \]


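Let $A$ be a set of finite measure outside of which $f_N$ vanishes, and let
$\complement A$ denote its complement in $X$. Then for all $n \geq N$ we have

\[ \int_{\complement A} |f_n| = \int_{\complement A} |f_n - f_N| \leq \int_X |f_n - f_N| < 2\varepsilon. \]

By Lemma 3.1, there exists a subset $Z$ of $A$ such that

\[ \mu(Z) < \frac{\varepsilon}{1 + \|f_N\|} \]

(where $\| \ \|$ is the sup norm) and a subsequence of $n$ such that $\{f_n\}$ tends
to 0 uniformly on $A - Z$. Then for $n$ large in this subsequence, we conclude that

\[ \int_{A - Z} |f_n| < \varepsilon. \]

Finally for $n$ large in this subsequence we have

\[ \int_Z |f_n| \leq \int_Z |f_n - f_N| + \int_Z |f_N| \leq \|f_n - f_N\|_1 + \mu(Z) \|f_N\| < 2\varepsilon. \]

Taking the sum of our integrals over $\complement A$, $A - Z$, and $Z$ we find the
desired bound,

\[ \|f_n\|_1 = \int_X |f_n| < 5\varepsilon \]

for $n$ large in the subsequence. Since $\{f_n\}$ is Cauchy, a bound of the same
kind then holds for all sufficiently large $n$. This proves the lemma.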


In view of Lemma 3.2, for every $f$ in $\mathcal{L}^1$ we can define the integral

\[ \int_X f \, d\mu = \int_X f = \lim_{n \to \infty} \int_X f_n \, d\mu \]

using any approximating sequence of step maps $\{f_n\}$ to $f$. Elements of
$\mathcal{L}^1$ will therefore be called integrable maps. It is clear that the integral is
a linear map of $\mathcal{L}^1$ into $E$.


We want to extend the seminorm $\| \ \|_1$ to $\mathcal{L}^1$. We need a lemma for
this.


Lemma 3.3. If $f$ is integrable and $\{f_n\}$ is an approximating sequence
of step maps, then $|f|$ is integrable, and $\{|f_n|\}$ approximates $|f|$. In
particular,

\[ \int_X |f| = \lim_{n \to \infty} \int_X |f_n| = \lim_{n \to \infty} \|f_n\|_1. \]


Proof. It is clear that $|f_n|$ converges to $|f|$ almost everywhere, so that
$|f|$ is integrable once we know that $\{|f_n|\}$ is a Cauchy sequence. To see
this, we note that

\[ \big|\, |f_n| - |f_m| \,\big| \leq |f_n - f_m|, \]

whence

\[ \big\|\, |f_n| - |f_m| \,\big\|_1 = \int_X \big|\, |f_n| - |f_m| \,\big| \leq \int_X |f_n - f_m| = \|f_n - f_m\|_1. \]

This proves the lemma.


Lemma 3.3 implies in particular that

\[ \lim_{n \to \infty} \|f_n\|_1 \]

is independent of the choice of approximating sequence $\{f_n\}$ to $f$, and
thus allows us to define

\[ \|f\|_1 = \int_X |f| = \lim_{n \to \infty} \|f_n\|_1. \]

By continuity, this is trivially verified to be a seminorm on $\mathcal{L}^1$.

Let us summarize what we have done. Our purpose was to construct
a completion (essentially) of the space of step mappings, under the
$L^1$-seminorm. In any case, we have constructed a space $\mathcal{L}^1$ on which we
have extended the integral and the seminorm by continuity. We must
still show that this space is complete. We could now either relate our
$\mathcal{L}^1$ with the space of equivalence classes of Cauchy sequences, and use
the result of Chapter IV, §4, that this latter space is complete, or reproduce
independently the proof of that result in the present instance. For
convenience, we do this.


Theorem 3.4. The space $\mathcal{L}^1$ is complete, under the seminorm $\| \ \|_1$.


Proof. Let $\{f_n\}$ be a Cauchy sequence in $\mathcal{L}^1$. For each $n$ there exists
an element $g_n \in \mathrm{St}(\mu)$ such that

\[ \|f_n - g_n\|_1 < 1/n. \]

The sequence $\{g_n\}$ is then Cauchy. Indeed, we have

\[ \|g_n - g_m\|_1 \leq \|g_n - f_n\|_1 + \|f_n - f_m\|_1 + \|f_m - g_m\|_1, \]

which gives a $3\varepsilon$-proof of the fact that $\{g_n\}$ is a Cauchy sequence. For a
subsequence of $n$, we know by Lemma 3.1 that $\{g_n\}$ converges almost
everywhere to a function $f$ in $\mathcal{L}^1$. For this subsequence, we then have

\[ \|f_n - f\|_1 \leq \|f_n - g_n\|_1 + \|g_n - f\|_1, \]

and this is $< 2\varepsilon$ for $n$ sufficiently large in the subsequence. Hence the
subsequence is $L^1$-convergent to $f$. It follows that the sequence $\{f_n\}$ itself
is $L^1$-convergent to $f$, and concludes the proof.


Note that the statement of Theorem 3.4 is to be interpreted in the
sense that given a Cauchy sequence $\{f_n\}$ of elements in $\mathcal{L}^1$, there exists
some $f$ in $\mathcal{L}^1$ such that given $\varepsilon$, we have $\|f_n - f\|_1 < \varepsilon$ for $n$ sufficiently
large. We still have the possibility that the seminorm $\| \ \|_1$ is not a
norm, so that strictly speaking, “the” completion in the sense of Chapter
IV, §4, would be the factor space of $\mathcal{L}^1$ by the subspace of all elements $f$
such that $\|f\|_1 = 0$.

Let us now take for granted the existence of a completion as the
space of equivalence classes of Cauchy sequences of step maps, modulo
null sequences. Denote this by $L^1(\mu)$. Then we can define a map

\[ \gamma: \mathcal{L}^1(\mu) \to L^1(\mu) \]

which to each integrable $f \in \mathcal{L}^1$ associates the equivalence class of a
Cauchy sequence $\{f_n\}$ approximating $f$. Lemma 3.2 shows that this map
is well defined, and it is obviously linear. The definition of the seminorm
on $\mathcal{L}^1$ means that in this notation, we have

\[ \|f\|_1 = \|\gamma(f)\|_1. \]


Similarly, the integral, which is a continuous linear map

\[ \int_X d\mu: \mathrm{St}(\mu, E) \to E \]

for the $L^1$-seminorm of $\mathrm{St}(\mu)$, extends in a natural way to $L^1(\mu)$. What
we have shown in Lemma 3.2 is that there is a way of lifting it to $\mathcal{L}^1$ in
such a way that for $f \in \mathcal{L}^1$ we have

\[ \int_X f = \int_X \gamma(f). \]

The continuity of the integral with respect to our $L^1$-seminorm is implied
by the relation

\[ \left| \int_X f \right| \leq \int_X |f| = \|f\|_1. \]

This relation is true for step maps $f$, and consequently holds for the
extension of our continuous linear map to the completion. Therefore, it
holds also for elements of $\mathcal{L}^1$ by Lemma 3.3 and the definition of the
seminorm $\| \ \|_1$ on $\mathcal{L}^1$. The preceding relation also shows that the
integral has norm $\leq 1$, as a linear map.


VI, §4. PROPERTIES OF THE INTEGRAL: FIRST PART 


We note that if $f \in \mathcal{L}^1$ and $g$ differs from $f$ only on a set of measure 0,
then $g$ lies in $\mathcal{L}^1$, and the integrals of $f$, $g$ coincide, as well as their
$L^1$-seminorms.

We also note that if $f \in \mathcal{L}^1$, we can always redefine $f$ on a set of
measure 0, say by giving it constant value on such a set, so that our new
map is measurable. Indeed, if $\{\varphi_n\}$ is a sequence of step maps converging
to $f$ except on some set $Z$ of measure 0, we let $\psi_n$ be the same map as $\varphi_n$
outside $Z$, and define $\psi_n(x) = 0$, say, for $x \in Z$. Then $\psi_n$ is measurable,
and the sequence $\{\psi_n\}$ converges everywhere to a map $g$ which is equal
to $f$ except on $Z$. Furthermore, $g$ is measurable, by M7.

The properties of the integral which we obtained for step maps now
extend to the integral of elements of $\mathcal{L}^1$. We shall go through
these properties systematically once more. We start by repeating that

\[
  \int_X d\mu \colon \mathcal{L}^1(\mu, E) \to E
\]

is linear.


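We observe that if $f$, $g$ are in $\mathcal{L}^1(\mu)$ then $|f|$, $|g|$
are in $\mathcal{L}^1(\mu, \mathbf{R})$, and consequently if
$E = \mathbf{R}$, then

\[
  \sup(f, g) = \tfrac{1}{2}\bigl(f + g + |f - g|\bigr)
\]

is in $\mathcal{L}^1$, and so is $\inf(f, g)$ for a similar reason, namely

\[
  \inf(f, g) = \tfrac{1}{2}\bigl(f + g - |f - g|\bigr).
\]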


The expression for the sup also shows that if $\{f_n\}$, $\{g_n\}$ are
sequences in $\mathcal{L}^1(\mu, \mathbf{R})$ which are $L^1$-convergent
to functions $f$, $g$ respectively, then $\sup(f_n, g_n)$ is
$L^1$-convergent to $\sup(f, g)$.

If $f$ is a real function, then we can write

\[
  f = f^+ - f^-
\]

where $f^+ = \sup(f, 0)$ and $f^- = -\inf(f, 0)$. It follows that $f$ is
in $\mathcal{L}^1$ if and only if $f^+$ and $f^-$ are in $\mathcal{L}^1$.
Such a decomposition is occasionally useful in dealing with real valued
maps.
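
The formulas for sup and inf, and the decomposition of a real function
into positive and negative parts, are pointwise identities, so they can be
checked on sample values. The following is a minimal sketch in exact
rational arithmetic; the helper names and the sample values are
illustrative assumptions, not part of the text.

```python
from fractions import Fraction

def sup_via_abs(a, b):
    # sup(f, g) = (f + g + |f - g|) / 2, evaluated at a single point
    return (a + b + abs(a - b)) / 2

def inf_via_abs(a, b):
    # inf(f, g) = (f + g - |f - g|) / 2, evaluated at a single point
    return (a + b - abs(a - b)) / 2

def positive_part(a):
    # f^+ = sup(f, 0)
    return sup_via_abs(a, 0)

def negative_part(a):
    # f^- = -inf(f, 0)
    return -inf_via_abs(a, 0)

# Hypothetical sample values standing in for f(x), g(x) at various points.
samples = [Fraction(-3, 2), Fraction(0), Fraction(5, 7), Fraction(2)]

sup_inf_ok = all(
    sup_via_abs(a, b) == max(a, b) and inf_via_abs(a, b) == min(a, b)
    for a in samples
    for b in samples
)

# f = f^+ - f^-  and  |f| = f^+ + f^-
decomposition_ok = all(
    positive_part(a) - negative_part(a) == a
    and positive_part(a) + negative_part(a) == abs(a)
    for a in samples
)
```

The second identity checked, $|f| = f^+ + f^-$, is the reason $f$ lies in
$\mathcal{L}^1$ as soon as $f^+$ and $f^-$ do.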
For any measurable set $A$ and any $f \in \mathcal{L}^1(\mu)$ the map
$f_A$ is also in $\mathcal{L}^1$. (Recall that $f_A$ is the same as $f$ on
$A$, and zero outside $A$.) Proof: If $\{\varphi_n\}$ is a sequence of
step maps approximating $f$, then $\{(\varphi_n)_A\}$ converges almost
everywhere to $f_A$, and is Cauchy because

\[
  \int_X |(\varphi_n)_A - (\varphi_m)_A|
    \leq \int_X |\varphi_n - \varphi_m| = \|\varphi_n - \varphi_m\|_1.
\]

Hence $\{(\varphi_n)_A\}$ approximates $f_A$. From the linearity of the
integral, we thus obtain:

If $A$, $B$ are disjoint measurable sets, then

\[
  \int_{A \cup B} f = \int_A f + \int_B f. \tag{1}
\]

This follows from the fact that $f_{A \cup B} = f_A + f_B$.

Over the reals, the integral is an increasing function of its variables.
This means: if $E = \mathbf{R}$ and $f \leq g$, then

\[
  \int f \leq \int g. \tag{2}
\]

Furthermore, if $f \geq 0$ and $A \subset B$ are measurable, then

\[
  \int_A f \leq \int_B f. \tag{3}
\]

Property 2 can be obtained from its positive alternate, namely

\[
  \text{If } f \geq 0, \text{ then } \int f \geq 0. \tag{2P}
\]

This is clear since an approximating sequence of step functions
$\{\varphi_n\}$ can always be taken such that $\varphi_n \geq 0$,
replacing $\varphi_n$ by $\sup(\varphi_n, 0)$ if necessary. Property 2
follows by linearity, and Property 3 is then obvious.


Finally, the integral on $\mathcal{L}^1(\mu)$ satisfies the inequalities

\[
  \left| \int_A f \, d\mu \right| \leq \int_A |f| \, d\mu
    \leq \|f\| \, \mu(A), \tag{4}
\]

where $\|\ \|$ is the sup norm. (We recall that $0 \cdot \infty = 0$.)
This is immediate, taking an approximating sequence $\{\varphi_n\}$ of
step maps to $f$, using continuity for the first inequality, and (2) for
the second. When $\|f\|$ or $\mu(A)$ is infinite, the inequality is clear,
and when both are finite, we use (2).

The next properties are general properties, immediate from the continuity
of the integral. We make the Banach space explicit here.


Theorem 4.1. Let $\lambda \colon E \to F$ be a continuous linear map of
Banach spaces. Then $\lambda$ induces a continuous linear map

\[
  \mathcal{L}^1(\mu, E) \to \mathcal{L}^1(\mu, F)
\]

by $f \mapsto \lambda \circ f$, and we have

\[
  \lambda \int_X f \, d\mu = \int_X \lambda \circ f \, d\mu.
\]

This is obvious for step maps, and follows by continuity for
$\mathcal{L}^1$.


Theorem 4.2. Let $E$, $F$ be Banach spaces. Then we have a toplinear
isomorphism

\[
  \mathcal{L}^1(\mu, E \times F) \to
    \mathcal{L}^1(\mu, E) \times \mathcal{L}^1(\mu, F).
\]

If $f \colon X \to E \times F$ is a map, with coordinate maps
$f = (g, h)$ in $E$ and $F$ respectively, then $f \in \mathcal{L}^1$ if
and only if $g$, $h$ are in $\mathcal{L}^1$, and then

\[
  \int f = \left( \int g, \int h \right).
\]


The proof is a simple exercise which we leave to the reader. (The
projection is a continuous linear map on each factor!) It applies in
particular in $\mathbf{R}^n$, or in $\mathbf{C}$, and we see that a
complex map is in $\mathcal{L}^1$ if and only if its real and imaginary
parts are in $\mathcal{L}^1$. Actually, this particular case can be seen
even more easily, for if we write a complex function

\[
  f = g + ih
\]

where $g$, $h$ are real, we note that a sequence of complex step functions
approximates $f$ if and only if its real part approximates $g$ and its
imaginary part approximates $h$ (with our definition of approximation,
that is $L^1$-Cauchy, and convergence almost everywhere). Thus

\[
  \int f = \int g + i \int h
\]

whenever $f$ is in $\mathcal{L}^1(\mu, \mathbf{C})$.


All the properties mentioned up to now are essentially routine, and 
are listed for the sake of completeness. It is natural to make such a list 
involving properties like linearity, monotonicity, sup, inf, behavior under 
linear maps, and product mappings, which are the standard finite opera- 
tions on maps and spaces. 

We now turn to the limiting operations, and list the properties of the
integral under these operations, giving a large number of criteria for
limit mappings to be in $\mathcal{L}^1$.


VI, §5. PROPERTIES OF THE INTEGRAL: SECOND PART


We first generalize the basic and crucial Lemma 3.1 to arbitrary maps in
$\mathcal{L}^1$. This will be formulated as Theorem 5.2. We need a minor
lemma to use in the proof, which was automatically satisfied when we dealt
with step maps. We define a measurable set to be $\sigma$-finite if it is
a countable union of sets of finite measure.

Lemma 5.1. Let $f \in \mathcal{L}^1(\mu)$ be measurable. Let $c > 0$. Let
$S_c$ be the set of all $x \in X$ such that $|f(x)| \geq c$. Then $S_c$
has finite measure. Furthermore, $f$ vanishes outside a $\sigma$-finite
set.


Proof. Let $\{\varphi_n\}$ be an approximating sequence of step functions
to $f$. Taking a subsequence if necessary and using Lemma 3.1, we can
assume that there exists a set $Z$ of measure $< \epsilon$ such that the
convergence of $\{\varphi_n\}$ is uniform on the complement of $Z$. Hence
for all sufficiently large $n$, we have

\[
  |\varphi_n(x)| \geq c/2 \quad \text{if } x \in S_c - Z.
\]

This proves that $S_c$ has finite measure. Taking the values $c = 1/k$ for
$k = 1, 2, \ldots$ shows that $f$ vanishes outside a $\sigma$-finite set.
Actually we can see this even more easily, since each $\varphi_n$ vanishes
outside a set of finite measure, and $f$ is the limit almost everywhere of
$\{\varphi_n\}$, whence $f$ vanishes outside a countable union of sets of
finite measure.


We see that Lemma 5.1 applies in particular to the characteristic
function of a measurable set: if it is in $\mathcal{L}^1$, then the
measure of this set is finite.


Theorem 5.2. Let $\{f_n\}$ be a Cauchy sequence in $\mathcal{L}^1$ which
is $L^1$-convergent to an element $f$ in $\mathcal{L}^1$. Then there
exists a subsequence which converges to $f$ almost everywhere, and also
such that given $\epsilon$, there exists a set $Z$ of measure
$< \epsilon$ such that the convergence is uniform on the complement of
$Z$.


Proof. Considering $f_n - f$ instead of $f_n$, we are reduced to proving
our theorem in the case $f = 0$. Selecting a subsequence, we may assume
without loss of generality that we have

\[
  \|f_n\|_1 < \frac{1}{2^{2n}}.
\]

Also, changing the $f_n$ on a set of measure 0, we can assume that all
$f_n$ are measurable. We proceed as in Lemma 3.1. Let $Y_n$ be the set of
$x$ such that $|f_n(x)| \geq 1/2^n$. Then

\[
  \frac{1}{2^n} \mu(Y_n) \leq \int_{Y_n} |f_n| \leq \int_X |f_n|
    \leq \frac{1}{2^{2n}},
\]

whence

\[
  \mu(Y_n) \leq \frac{1}{2^n}.
\]

Let $Z_n = Y_n \cup Y_{n+1} \cup \cdots$. Then
$\mu(Z_n) \leq 1/2^{n-1}$. If $x \notin Z_n$, then for $k \geq n$ we have

\[
  |f_k(x)| < \frac{1}{2^k},
\]

whence $\{f_k\}$ converges uniformly to 0 on the complement of $Z_n$. We
let $Z$ be the intersection of all $Z_n$. Then $Z$ has measure 0, and it
is clear that $\{f_k\}$ converges pointwise to 0 on the complement of
$Z$. This proves our theorem.
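
The estimate $\frac{1}{2^n}\mu(Y_n) \leq \int_{Y_n} |f_n| \leq \int_X |f_n|$
used in the proof is an instance of the general inequality
$c \, \mu(\{|f| \geq c\}) \leq \int |f|$. As a sanity check, the sketch
below evaluates both sides on a small finite measure space; the points,
masses, and values of `f` are hypothetical and chosen only for
illustration.

```python
from fractions import Fraction

# Hypothetical finite measure space: each point carries a mass.
mass = {"a": Fraction(1, 2), "b": Fraction(1, 3),
        "c": Fraction(2), "d": Fraction(1, 6)}
# Hypothetical real valued function on that space.
f = {"a": Fraction(-4), "b": Fraction(1, 10),
     "c": Fraction(1, 4), "d": Fraction(3)}

def measure(subset):
    # mu(subset) as a finite sum of point masses
    return sum((mass[x] for x in subset), Fraction(0))

def integral_abs(subset=None):
    # integral of |f| over the subset (over the whole space if None)
    points = mass.keys() if subset is None else subset
    return sum((abs(f[x]) * mass[x] for x in points), Fraction(0))

def large_set(c):
    # the set Y of points where |f| >= c
    return {x for x in f if abs(f[x]) >= c}

c = Fraction(1, 4)
Y = large_set(c)
lower = c * measure(Y)      # c * mu(Y)
middle = integral_abs(Y)    # integral of |f| over Y
upper = integral_abs()      # integral of |f| over X
```

With these values `lower <= middle <= upper`, which is the chain of
inequalities applied to each $f_n$ with $c = 1/2^n$.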


Corollary 5.3. An element $f \in \mathcal{L}^1$ has seminorm
$\|f\|_1 = 0$ if and only if $f$ is equal to 0 almost everywhere.


Proof. Assume that $\|f\|_1 = 0$. Then the sequence $\{0, 0, \ldots\}$
converges in $\mathcal{L}^1$ to $f$, and by Theorem 5.2, it converges
pointwise almost everywhere to $f$, so that $f$ is 0 almost everywhere.
The converse is obvious.


Corollary 5.3 is a major result in our theory. We define two maps of
$X$ into $E$ to be equivalent if they differ only on a set of measure 0.
We see that the actual completion of the space of step maps under the
$L^1$-seminorm is the space of equivalence classes of functions in
$\mathcal{L}^1$, under the equivalence defined by the property of being
equal almost everywhere. In other words, the kernel of the map

\[
  \gamma \colon \mathcal{L}^1(\mu) \to L^1(\mu)
\]

is the space of maps $f$ which are 0 almost everywhere.


Corollary 5.4. Let $\{f_n\}$ be a Cauchy sequence in $\mathcal{L}^1$
which converges almost everywhere to a mapping $f$. Then $f$ is in
$\mathcal{L}^1$, and is the $L^1$-limit of $\{f_n\}$.


Proof. The sequence $\{f_n\}$ is $L^1$-convergent to some
$g \in \mathcal{L}^1$, and by the theorem, some subsequence converges
almost everywhere to $g$. Since this subsequence converges almost
everywhere also to $f$, it follows that $f = g$ almost everywhere. This
proves our corollary.


Theorem 5.5 (Monotone Convergence Theorem). Let $\{f_n\}$ be an
increasing (resp. decreasing) sequence of real valued functions in
$\mathcal{L}^1$ such that the integrals

\[
  \int_X f_n \, d\mu
\]

are bounded. Then $\{f_n\}$ is Cauchy, and is both $L^1$ and almost
everywhere convergent to some function $f \in \mathcal{L}^1$.


Proof. Suppose that we deal with the increasing case. Let

\[
  \alpha = \sup_k \int_X f_k \, d\mu.
\]

Then for $n \geq m$ we have

\[
  \|f_n - f_m\|_1 = \int_X (f_n - f_m) \, d\mu
    = \int_X f_n \, d\mu - \int_X f_m \, d\mu
    \leq \alpha - \int_X f_m \, d\mu,
\]

whence we see that the sequence of functions is Cauchy. By Theorem 5.2
a subsequence converges almost everywhere, and since the sequence
$\{f_n\}$ is increasing, it follows that $\{f_n\}$ itself converges almost
everywhere. That convergence is in $L^1$-seminorm by Corollary 5.4. This
proves our assertion in the increasing case, and the decreasing case is
similar, or follows by considering the sequence $\{-f_n\}$.


Corollary 5.6. If $\{f_n\}$ is a sequence of real valued functions in
$\mathcal{L}^1$, and if there exists a real-valued function
$g \in \mathcal{L}^1$ such that $g \geq 0$ and $|f_n| \leq g$ for all $n$,
then $\sup f_n$ and $\inf f_n$ are in $\mathcal{L}^1$, and

\[
  \sup \int f_n \leq \int \sup f_n
  \quad \text{and} \quad
  \int \inf f_n \leq \inf \int f_n.
\]


Proof. The functions

\[
  \sup(f_1, \ldots, f_n)
\]

are in $\mathcal{L}^1$, and form an increasing sequence bounded by $g$.
Hence they converge almost everywhere and we can apply the theorem to
conclude the proof for the sup. The inf is dealt with similarly.


For the next corollary, we recall a definition. Let $\{f_n\}$ be a
sequence of real valued functions $\geq 0$. If

\[
  \lim_{k \to \infty} \inf_{n \geq k} f_n
\]

exists, we call it the lim inf of the sequence $\{f_n\}$. It is clear
that if $\{f_n\}$ converges pointwise, then its lim inf exists and is
equal to the limit. Actually, in the next corollary,

\[
  \lim_{k \to \infty} \inf_{n \geq k} f_n(x)
\]

will exist for almost all $x$, and the resulting function, which we may
define arbitrarily on a set of measure zero, will be in $\mathcal{L}^1$.
By abuse of language, we still denote it by $\liminf f_n$.


Corollary 5.7 (Fatou’s Lemma). Let $\{f_n\}$ be a sequence of real valued
functions $\geq 0$ in $\mathcal{L}^1$. Assume that

\[
  \liminf \|f_n\|_1
\]

exists (so is a real number $\geq 0$). Then $\liminf f_n(x)$ exists for
almost all $x$, the function $\liminf f_n$ is in $\mathcal{L}^1$, and we
have

\[
  \int_X \liminf f_n \, d\mu \leq \liminf \int_X f_n \, d\mu
    = \liminf \|f_n\|_1.
\]


Proof. We apply the monotone convergence theorem twice, first to
the decreasing sequence $\{g_m\}$ given by

\[
  g_m = \inf(f_k, f_{k+1}, \ldots, f_{k+m}).
\]

Since $\{g_m\}$ is a decreasing sequence, converging to
$\inf_{n \geq k} f_n$, and since

\[
  \int \inf(f_k, \ldots, f_{k+m}) \leq \int f_{k+j}
  \quad \text{for } j = 1, \ldots, m,
\]

we conclude from the monotone convergence theorem that

\[
  \int \inf_{n \geq k} f_n \leq \inf_{n \geq k} \int f_n
    \leq \lim_{k \to \infty} \inf_{n \geq k} \int f_n.
\]

Let $h_k = \inf_{n \geq k} f_n$. Then $\{h_k\}$ is an increasing sequence
for $k = 1, 2, \ldots$, and we can apply the monotone convergence theorem
to $h_k$. The limit $\lim_{k \to \infty} h_k$ is precisely
$\liminf f_n$, and Fatou’s lemma drops out as desired.


Note. Fatou’s lemma is used most often in the simple case when $\{f_n\}$
is pointwise convergent almost everywhere, and when the $L^1$-seminorms
$\|f_n\|_1$ are bounded, thus ensuring that the pointwise $\lim f_n$ is in
$\mathcal{L}^1$.
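
The inequality in Fatou’s lemma can be strict. The sketch below works on
a hypothetical finite measure space and takes $f_n$ to be the
characteristic function of a set `A` for even $n$ and of a set `B` for odd
$n$; all names and data are illustrative assumptions. In that case
$\liminf f_n$ is the characteristic function of $A \cap B$, while
$\liminf \int f_n = \min(\mu(A), \mu(B))$.

```python
from fractions import Fraction

# Hypothetical finite measure space: 8 points of mass 1/4 each.
mass = {x: Fraction(1, 4) for x in range(8)}
A = {0, 1, 2, 3, 4}
B = {3, 4, 5, 6}

def f(n, x):
    # f_n is the indicator of A for even n and of B for odd n
    return 1 if x in (A if n % 2 == 0 else B) else 0

def integral(g):
    # integral of g over the finite space, as a weighted sum
    return sum((g(x) * mass[x] for x in mass), Fraction(0))

def tail_inf(k, x, period=2):
    # n -> f_n(x) has period 2, so the inf over all n >= k
    # is already attained among n = k, k + 1
    return min(f(n, x) for n in range(k, k + period))

# Here h_k = inf_{n >= k} f_n does not depend on k, so lim inf f_n = h_1.
integral_of_liminf = integral(lambda x: tail_inf(1, x))

# The integrals alternate between mu(B) and mu(A); their lim inf is the min.
liminf_of_integrals = min(integral(lambda x: f(n, x)) for n in (1, 2))
```

With this data the left side equals $\mu(A \cap B) = 1/2$ and the right
side equals $\mu(B) = 1$.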


Theorem 5.8 (Dominated Convergence Theorem). Let $\{f_n\}$ be a sequence
of mappings in $\mathcal{L}^1(\mu)$. Assume that there exists some
function $g \in \mathcal{L}^1(\mu, \mathbf{R})$ such that $g \geq 0$ and
$|f_n| \leq g$ for all $n$. Assume that $\{f_n\}$ converges almost
everywhere to some map $f$. Then $f$ is in $\mathcal{L}^1$ and $\{f_n\}$
is $L^1$-convergent to $f$.




Proof. For each positive integer $k$, let

\[
  g_k = \sup_{m, n \geq k} |f_n - f_m|.
\]

Then $\{g_k\}$ is a decreasing sequence of real valued functions, and
since $|f_n - f_m| \leq 2g$, it follows from Corollary 5.6 that each
$g_k$ is in $\mathcal{L}^1$. By the hypothesis, the sequence $\{g_k\}$
converges almost everywhere to 0, and by the monotone convergence theorem
it is also $L^1$-convergent to 0. Since
$\|f_n - f_m\|_1 \leq \|g_k\|_1$ for $m, n \geq k$, the sequence
$\{f_n\}$ is actually a Cauchy sequence, and we can apply Corollary 5.4
to conclude the proof.


We now refer for the first time since the definition of $\mathcal{L}^1$
to the notion of $\mu$-measurability. The point is that we want to give
criteria for the limit of a sequence of maps to be in $\mathcal{L}^1$,
and $\mu$-measurability is the natural hypothesis here. We refer the
reader to M11 and emphasize the countability implications arising from a
map being in $\mathcal{L}^1$, and hence $\mu$-measurable (by definition).


Corollary 5.9. Let $f$ be $\mu$-measurable. Then $f$ is in
$\mathcal{L}^1(\mu)$ if and only if its absolute value $|f|$ is in
$\mathcal{L}^1(\mu, \mathbf{R})$. More generally, assume that there
exists an element $g \in \mathcal{L}^1(\mu, \mathbf{R})$ such that
$g \geq 0$ and such that $|f| \leq g$. Then $f$ is in
$\mathcal{L}^1(\mu)$.


Proof. Let $\{\varphi_n\}$ be a sequence of step maps converging
pointwise to $f$. Without loss of generality we can assume that $g$ is
measurable. (We may have to change all $\varphi_n$, $f$, and the given
$g$ on a set of measure 0.) Define a map $h_n$ by

\[
  h_n(x) =
  \begin{cases}
    \varphi_n(x) & \text{if } |\varphi_n(x)| \leq 2g(x), \\
    0 & \text{if } |\varphi_n(x)| > 2g(x).
  \end{cases}
\]

The set $S_n$ of all $x$ such that $2g(x) - |\varphi_n(x)| \geq 0$ is
measurable, and it follows that $h_n$ is in $\mathcal{L}^1(\mu)$ for each
$n$. Furthermore $\{h_n\}$ converges pointwise to $f$, and
$|h_n| \leq 2g$. We can therefore apply the dominated convergence theorem
to conclude the proof.


Note. Corollary 5.9 explains the role of positivity in integration theory. 


Corollary 5.10. Let $\{f_n\}$ be a sequence of maps in
$\mathcal{L}^1(\mu)$ which converges pointwise almost everywhere to $f$.
If there exists $C \geq 0$ such that $\|f_n\|_1 \leq C$ for all $n$, then
$f$ is in $\mathcal{L}^1$ and $\|f\|_1 \leq C$.


Proof. All $f_n$ are $\mu$-measurable, and hence $f$ is
$\mu$-measurable, by M12 of §1. By Corollary 5.9, it suffices to prove
that $|f|$ is in $\mathcal{L}^1(\mu, \mathbf{R})$. But
$|f| = \lim |f_n|$, and Fatou’s lemma applies to conclude the proof.


Remark. In Corollary 5.10, we don’t assert of course that $\{f_n\}$ is
$L^1$-convergent to $f$. This is in general not true since for instance
we can find a sequence $\{f_n\}$ converging everywhere to 0 such that
each $f_n$ has $\|f_n\|_1 = 1$. (Take very thin tall vertical strips
moving towards the $y$-axis.) To get $L^1$-convergence, we must of course
cut down such $f_n$ in a manner similar to that used in Corollary 5.9.
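
One concrete choice of such strips, given here only as an illustration
and not taken from the text, is $f_n = n \chi_{(0, 1/n)}$ on the real line
with length as measure. Each $f_n$ is a step function of height $n$ on an
interval of length $1/n$, so $\|f_n\|_1 = 1$, while for each fixed $x$ the
values $f_n(x)$ are eventually 0. The sketch below checks both facts in
exact arithmetic.

```python
from fractions import Fraction

def f(n, x):
    # f_n equals n on the open interval (0, 1/n) and 0 elsewhere
    return Fraction(n) if 0 < x < Fraction(1, n) else Fraction(0)

def l1_norm(n):
    # f_n is a step function: (height n) * (interval length 1/n)
    return Fraction(n) * Fraction(1, n)

def first_index_where_zero(x):
    # smallest N such that f_n(x) = 0 for all n >= N
    if x <= 0:
        return 1
    n = 1
    while Fraction(1, n) > x:
        n += 1
    return n

x0 = Fraction(1, 10)
N0 = first_index_where_zero(x0)
```

So $\{f_n\}$ converges everywhere to 0 but not in $L^1$-seminorm, and no
function in $\mathcal{L}^1$ dominates all the $f_n$.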


Corollary 5.11. Let $f \in \mathcal{L}^1(\mu)$. Let $g$ be a bounded
measurable function on $X$ (so real or complex). Then $gf$ is in
$\mathcal{L}^1(\mu)$.


Proof. Let $\{\varphi_n\}$ be a sequence of step maps converging both
$L^1$ and almost everywhere to $f$. Using M8 of §1, let $\{\psi_n\}$ be a
sequence of simple functions converging pointwise to $g$. Then
$\{\varphi_n \psi_n\}$ is a sequence of step maps, and as
$n \to \infty$, this sequence converges almost everywhere to $fg$.
Changing $f$ and $g$ on a set of measure 0 (e.g. giving them the value
0), we can assume that this convergence is pointwise everywhere. If $C$
is a bound for $g$, i.e. $|g(x)| \leq C$ for all $x$, then
$|fg| \leq C|f|$. We can now apply Corollary 5.9 to conclude the proof
that $fg$ is in $\mathcal{L}^1$. We can also reproduce the proof of
Corollary 5.9, i.e. after suitable adjustment we may suppose that

\[
  |\varphi_n| \leq 2|f|
\]

for all $n$, whence $|\varphi_n \psi_n| \leq 2C|f|$ for all $n$, and then
apply the dominated convergence theorem directly.


Corollary 5.12. Let $E \times F \to G$ be a continuous bilinear map of
Banach spaces into another. Let $f \in \mathcal{L}^1(\mu, E)$ and let $g$
be a bounded $\mu$-measurable map of $X$ into $F$. Then
$fg \in \mathcal{L}^1(\mu, G)$.


Proof. There is nothing to change in the preceding proof. 


Corollary 5.13. Let $\{f_n\}$ be a sequence of maps in $\mathcal{L}^1$
such that

\[
  \sum_{n=1}^{\infty} \int_X |f_n| \, d\mu
\]

converges. Then the series

\[
  f(x) = \sum_{n=1}^{\infty} f_n(x)
\]

converges almost everywhere, the map $f$ is in $\mathcal{L}^1$, and

\[
  \int_X f \, d\mu = \sum_{n=1}^{\infty} \int_X f_n \, d\mu.
\]


Proof. Immediate from the dominated convergence theorem, considering the
partial sums, and using the function

\[
  g(x) = \lim_{n \to \infty} \sum_{k=1}^{n} |f_k(x)|.
\]


Example. It is often useful to consider sums as in Corollary 5.13 in
the following context. Let $\{A_n\}$ be a sequence of disjoint measurable
sets whose union is equal to $X$. For each $n$ let $f_n$ be integrable
over $A_n$, and define $f_n$ to be 0 outside $A_n$, so that $f_n$ is then
defined over all of $X$. Let

\[
  f = \sum_{n=1}^{\infty} f_n.
\]

(Conversely, if $f$ is given on all of $X$, we could let
$f_n = f_{A_n} = f \chi_{A_n}$, where $\chi_{A_n}$ is the characteristic
function of $A_n$.) If

\[
  \sum_{n=1}^{\infty} \int_X |f_n| \, d\mu
\]

converges, then it follows that $f$ is in $\mathcal{L}^1$ over all of
$X$.


Remark. In our discussion of measurability, we have already pointed
out that a pointwise limit of step maps takes its values in a separable
set, i.e. having a countable dense subset. Actually, taking the space
generated by the values of the step maps in a sequence converging to $f$
we see that this space, and its closure, have a countable dense subset.
This applies when $f$ is in $\mathcal{L}^1$ since we can change $f$ on a
set of measure 0, say giving $f$ the value 0 on such a set, so that $f$
is a pointwise limit of step maps on the complement of this set.
Furthermore, we also recall that a limit of step maps vanishes outside a
countable union of sets of finite measure, and this also applies to an
element of $\mathcal{L}^1$.


Corollary 5.14. Let $f$ be in $\mathcal{L}^1$. Given $\epsilon$, there
exists a set of finite measure $A$ such that

\[
  \left| \int_X f \, d\mu - \int_A f \, d\mu \right| < \epsilon.
\]


Proof. As we have remarked, we can change $f$ on a set of measure 0
such that $f$ vanishes outside a countable union of sets of finite
measure, say $\{A_n\}$. Let

\[
  B_n = A_1 \cup \cdots \cup A_n.
\]

The sets $B_n$ are increasing, and without loss of generality we may
assume that $X = \bigcup B_n$. Then

\[
  \left| \int_X f \, d\mu - \int_{B_n} f \, d\mu \right|
    = \left| \int_{X - B_n} f \, d\mu \right|
    \leq \int_{X - B_n} |f| \, d\mu
    = \int_X |f| (1 - \chi_{B_n}) \, d\mu.
\]

We let $A = B_n$ for $n$ large. Our corollary follows from the monotone
convergence theorem.


The next theorem has a probabilistic interpretation as follows. If $A$ is
a set of finite measure $\neq 0$, we may view

\[
  \frac{1}{\mu(A)} \int_A f \, d\mu
\]

as the average of $f$ over $A$. The theorem will assert that if the
average of $f$ over all such $A$ lies in some closed set $S$, then in
fact the values $f(x)$ must lie in $S$ for almost all $x$. We call this
the averaging theorem.


Theorem 5.15. Let $f \in \mathcal{L}^1(\mu, E)$. Let $S$ be a closed
subset of $E$ and assume that for all measurable sets $A$ of finite
measure $\neq 0$ we have

\[
  \frac{1}{\mu(A)} \int_A f \, d\mu \in S.
\]

Suppose $0 \in S$ or $X$ is $\sigma$-finite. Then $f(x) \in S$ for almost
all $x$.


Proof. Changing $f$ on a set of measure 0, we may assume without
loss of generality that $f$ vanishes outside a set which is a countable
union of sets of finite measure, and that $E$ has a countable dense
subset. It is then clear that it will suffice to prove our theorem under
the additional assumption that $\mu(X) < \infty$, which we now make. Let
$v \in E$ and $v \notin S$. Let $B_r(v)$ be an open ball of radius $r$
centered at $v$ and not intersecting $S$. Let $A$ be the set of all
$x \in X$ such that $f(x) \in B_r(v)$. We prove that $A$ has measure 0.
Indeed, if $\mu(A) > 0$ we have

\[
  \left| \frac{1}{\mu(A)} \int_A f \, d\mu - v \right|
    = \frac{1}{\mu(A)} \left| \int_A (f - v) \, d\mu \right|
    \leq \frac{1}{\mu(A)} \int_A |f - v| \, d\mu < r,
\]

which is a contradiction. Hence $\mu(A) = 0$. The theorem follows using
the countability assumption on $E$, and using a countable dense set in
the complement of $S$, together with open balls of rational radii around
the elements of this set, which form a base for the topology.


Corollary 5.16. Let $f \in \mathcal{L}^1(\mu)$ and assume that

\[
  \int_A f \, d\mu = 0
\]

for every measurable set $A$ of finite measure. Then $f$ is equal to 0
almost everywhere.


Proof. We take S to consist of 0 alone, and apply the theorem. 


Corollary 5.17. Let $f \in \mathcal{L}^1(\mu)$. For each step function
$g$ the map $fg$ is in $\mathcal{L}^1(\mu)$, and if

\[
  \int_X fg \, d\mu = 0
\]

for all step functions $g$, then $f(x) = 0$ for almost all $x$.


Proof. Apply Corollary 5.16 to characteristic functions $\chi_A$.


Corollary 5.18. Let $f \in \mathcal{L}^1(\mu)$. Let $b \geq 0$. If

\[
  \left| \int_A f \, d\mu \right| \leq b \mu(A)
\]

for all sets $A$ of finite measure, then $|f(x)| \leq b$ for almost all
$x$.


Proof. Let $S_n$ be the subset of $E$ consisting of those elements $v$
such that $|v| \geq b + 1/n$ and apply the theorem. Then take the union
for $n = 1, 2, \ldots$.


The next corollary is included for later applications. The reader inter- 
ested only in the case of complex or real functions may omit it. 


Corollary 5.19. Let $E$ be a Hilbert space and
$f \in \mathcal{L}^1(\mu, E)$. If

\[
  \int_X \langle f, g \rangle \, d\mu = 0
\]

for all step maps $g$, then $f(x) = 0$ for almost all $x$.




Proof. The proof is really just like that of Corollary 5.17. First we
may assume that the image of $f$ is contained in a separable Hilbert
subspace. Let $e$ be a unit vector. For any measurable set $A$ of finite
measure, the step map $e \chi_A$ having value $e$ in $A$ and 0 outside
$A$ is bounded measurable. Let us denote by $f_e$ the Fourier coefficient
of $f$ along $e$ so that $f_e$ is a function. We have

\[
  0 = \int_X \langle f, e \chi_A \rangle \, d\mu = \int_A f_e \, d\mu.
\]

This being true for all $A$ it follows that $f_e$ is equal to 0 almost
everywhere. Since there is a countable Hilbert basis in our Hilbert
space, it follows that $f$ is 0 almost everywhere.


Corollary 5.20. Let $E$ be a Hilbert space and
$f \in \mathcal{L}^1(\mu, E)$. For each unit vector $e \in E$, let $f_e$
be the component of $f$ along $e$. Let $b \geq 0$. Assume that for each
unit vector $e$ and each set of finite measure $A$ we have

\[
  \left| \int_A f_e \, d\mu \right| \leq b \mu(A).
\]

Then $|f(x)| \leq b$ for almost all $x$.


Proof. We may assume that $E$ is separable as in Theorem 5.15, and
that $\mu(X) < \infty$. Let $v \in E$ and $|v| > b$. Let $B_r(v)$ be an
open ball of radius $r$ centered at $v$ not intersecting $B_b(0)$. If $A$
is the set of all $x \in X$ such that $f(x) \in B_r(v)$, we take $e$ to
be a unit vector in the direction of $v$. Let $c = |v|$. If $x \in A$,
then $|f_e(x) - c| < r$ so that $f_e(x) \in B_r(c)$. By Corollary 5.18 it
follows that $A$ has measure 0. Our Corollary 5.20 follows at once.


VI, §6. APPROXIMATIONS


We shall analyze Theorem 5.2 more closely, so as to fit certain
situations which arise in practice. Let us look at a special case, the
real line. The most natural definition of any integral on $\mathbf{R}$ is
to start with step functions defined on bounded intervals (open, closed,
or half open or closed), and define the integral for these. However, the
sets which are finite unions of bounded intervals do not form a
$\sigma$-algebra, only an algebra. Thus we are faced with two problems:
extend the measure (length) function on bounded intervals to a measure on
the $\sigma$-algebra generated by the finite intervals, and second, show
that the step functions taken with respect to finite intervals are still
$L^1$-dense in the $\mathcal{L}^1$-completion. The problem of extending
the measure to a $\sigma$-algebra is dealt with in §7. Here, we settle
the other question, and a countability condition arises naturally.

Let $\mathcal{A}$ be a subalgebra of $\mathcal{M}$ and assume that
$\mathcal{A}$ consists of sets of finite measure. We shall say that $X$
is $\sigma$-finite with respect to $\mathcal{A}$ if $X$ is a countable
union of elements of $\mathcal{A}$. Taking the usual inductive
complementation, we see that if $X$ is $\sigma$-finite with respect to
$\mathcal{A}$, then in fact, there is a sequence $\{A_n\}$ of disjoint
elements of $\mathcal{A}$ such that

\[
  X = \bigcup_{n=1}^{\infty} A_n.
\]

We recall that a step map $f$ with respect to $\mathcal{A}$ is a map
which is equal to 0 outside some element $A$ of $\mathcal{A}$ and such
that there is a partition $\{A_1, \ldots, A_r\}$ of $A$ consisting of
elements of $\mathcal{A}$, such that $f$ is step with respect to this
partition. We shall denote the space of step maps with respect to
$\mathcal{A}$ by $\mathrm{St}(\mathcal{A})$. We are interested in giving
conditions under which the closure of $\mathrm{St}(\mathcal{A})$ in
$\mathcal{L}^1(\mu)$ is equal to $\mathcal{L}^1$. The next two lemmas
lead to the theorem giving such a criterion. We first consider those
measurable subsets contained in some element $A$ of $\mathcal{A}$. Thus
we denote by $\mathcal{A}_A$ the algebra induced by $\mathcal{A}$ on $A$,
i.e. the algebra of all elements of $\mathcal{A}$ contained in $A$. We
let $\mathrm{St}(\mathcal{A}_A)$ be the vector space of step maps with
respect to $\mathcal{A}_A$.


Remark. Let $Y$ be a measurable subset of an element $A$ of
$\mathcal{A}$ and $\chi_Y$ its characteristic function. Let $\varphi$ be
a step function such that

\[
  \|\chi_Y - \varphi\|_1 < \epsilon.
\]

If we let $\varphi_1 = \inf(\varphi, 1)$, then
$|\chi_Y - \varphi_1| \leq |\chi_Y - \varphi|$, and hence

\[
  \|\chi_Y - \varphi_1\|_1 < \epsilon.
\]

We have a similar situation taking $\sup(\varphi, 0)$. We are interested
in those $Y$ such that $\chi_Y$ lies in the closure of
$\mathrm{St}(\mathcal{A}_A, \mathbf{R})$. Our remark shows that in
determining those $Y$, we may restrict our attention to those step
functions $\varphi$ such that

\[
  0 \leq \varphi \leq 1.
\]

For what follows, we also observe that
$\mathrm{St}(\mathcal{A}_A, \mathbf{R})$ is closed under the operations
of sup and inf.


Lemma 6.1. Let $A$ be an element of $\mathcal{A}$. Let $\mathcal{N}_A$ be
the collection of measurable subsets $Y$ of $A$ whose characteristic
function $\chi_Y$ lies in the $L^1$-closure of
$\mathrm{St}(\mathcal{A}_A, \mathbf{R})$, i.e. such that given
$\epsilon$, there exists a step function
$\varphi \in \mathrm{St}(\mathcal{A}_A, \mathbf{R})$ satisfying

\[
  \|\chi_Y - \varphi\|_1 < \epsilon. \tag{$*$}
\]

Then $\mathcal{N}_A$ is a $\sigma$-algebra in $A$.

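Proof. First we show that $\mathcal{N}_A$ is an algebra. If $Y$,
$Z \in \mathcal{N}_A$, then

\[
  \sup(\chi_Y, \chi_Z) = \chi_{Y \cup Z}
  \quad \text{and} \quad
  \inf(\chi_Y, \chi_Z) = \chi_{Y \cap Z}
\]

lie in the $L^1$-closure of $\mathrm{St}(\mathcal{A}_A, \mathbf{R})$, so
that $Y \cup Z$ and $Y \cap Z$ are in $\mathcal{N}_A$. Also,
$\chi_A - \chi_Y = \chi_{A - Y}$ shows that $A - Y$ is in
$\mathcal{N}_A$. Hence $\mathcal{N}_A$ is an algebra. To show that it is
a $\sigma$-algebra, it suffices to show that if $\{Y_n\}$ is a sequence
in $\mathcal{N}_A$ of disjoint elements, then $\bigcup Y_n$ is in
$\mathcal{N}_A$. (If we have an arbitrary sequence in $\mathcal{N}_A$, we
can always adjust it by taking relative complementations to yield another
sequence of disjoint elements in $\mathcal{N}_A$, having the same union.)
Thus let $\{Y_n\}$ be a disjoint sequence in $\mathcal{N}_A$ and let
$\{\varphi_n\}$ be step functions in
$\mathrm{St}(\mathcal{A}_A, \mathbf{R})$ such that

\[
  \|\chi_{Y_n} - \varphi_n\|_1 < \frac{\epsilon}{2^n}.
\]

Let

\[
  Y = \bigcup_{n=1}^{\infty} Y_n.
\]

Then

\[
  \left\| \chi_Y - \sum_{k=1}^{n} \varphi_k \right\|_1
    \leq \|\chi_Y - \chi_{Y_1 \cup \cdots \cup Y_n}\|_1
    + \left\| \chi_{Y_1 \cup \cdots \cup Y_n}
        - \sum_{k=1}^{n} \varphi_k \right\|_1.
\]

We take $n$ so large that the first term on the right is $< \epsilon$.
The second term on the right is estimated by

\[
  \sum_{k=1}^{n} \|\chi_{Y_k} - \varphi_k\|_1 < \epsilon.
\]

This proves that $\mathcal{N}_A$ is a $\sigma$-algebra in $A$.

The next lemma pertains to a completely general situation.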


Lemma 6.2. Let $\{A_i\}_{i \in I}$ be a family of sets whose union is
equal to $X$. For each $i$, let $\mathcal{N}_i$ be a $\sigma$-algebra of
subsets of $A_i$. Let $\mathcal{N}$ be the collection of subsets $Y$ of
$X$ such that $Y \cap A_i \in \mathcal{N}_i$ for all $i$. Then
$\mathcal{N}$ is a $\sigma$-algebra in $X$.


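Proof. Let $Y \in \mathcal{N}$. Then
$\mathcal{C}Y \cap A_i = A_i - Y = A_i - (Y \cap A_i)$. Hence
$\mathcal{C}Y \in \mathcal{N}$. Let $Y$, $Z \in \mathcal{N}$. Then

\[
  (Y \cap Z) \cap A_i = (Y \cap A_i) \cap (Z \cap A_i),
\]

whence $Y \cap Z$ is in $\mathcal{N}$. Let $\{Y_k\}$ be a sequence of
subsets of $X$ in $\mathcal{N}$. Then

\[
  \left( \bigcup_{k=1}^{\infty} Y_k \right) \cap A_i
    = \bigcup_{k=1}^{\infty} (Y_k \cap A_i),
\]

whence $\bigcup Y_k$ is in $\mathcal{N}$. This proves our lemma.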


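Theorem 6.3. Let $\mathcal{A}$ be a subalgebra of $\mathcal{M}$,
consisting of sets of finite measure, generating $\mathcal{M}$. Assume
that $X$ is $\sigma$-finite with respect to $\mathcal{A}$. Then the space
$\mathrm{St}(\mathcal{A})$ of step mappings with respect to $\mathcal{A}$
is dense in $\mathcal{L}^1(\mu, E)$. Furthermore, if $\{A_n\}$ is a
sequence in $\mathcal{A}$ whose union is $X$, then for all
$Y \in \mathcal{M}$, $\chi_{Y \cap A_n}$ lies in the $L^1$-closure of
$\mathrm{St}(\mathcal{A}_{A_n}, \mathbf{R})$, for all $n$.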


Proof. We prove the second assertion first. By Lemma 6.1, we have a
$\sigma$-algebra $\mathcal{N}_{A_n}$, and we apply Lemma 6.2. Every
element $A$ of $\mathcal{A}$ is such that
$A \cap A_n \in \mathcal{N}_{A_n}$, and since $\mathcal{A}$ generates
$\mathcal{M}$, we conclude that $\mathcal{N} = \mathcal{M}$.

Next, we prove a special case of our first statement:

If $Y$ is a measurable set of finite measure, given $\epsilon$ there
exists a step function $\varphi$ with respect to $\mathcal{A}$ such that

\[
  \|\chi_Y - \varphi\|_1 < \epsilon.
\]


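Taking relative complements, we may assume the $A_n$ are disjoint. By
Lemmas 6.1 and 6.2, for each $n$ there exists a step function
$\varphi_n$ with respect to $\mathcal{A}$ such that

\[
  \|\chi_{Y \cap A_n} - \varphi_n\|_1 < \frac{\epsilon}{2^n}.
\]

Since $Y$ is the union of all sets $Y \cap A_n$, we can find some $n$
such that

\[
  \mu\left( Y - \bigcup_{k=1}^{n} (Y \cap A_k) \right) < \epsilon,
\]

or in other words such that

\[
  \left\| \chi_Y - \sum_{k=1}^{n} \chi_{Y \cap A_k} \right\|_1 < \epsilon.
\]

It follows that

\[
  \left\| \chi_Y - \sum_{k=1}^{n} \varphi_k \right\|_1
    \leq \left\| \chi_Y - \sum_{k=1}^{n} \chi_{Y \cap A_k} \right\|_1
    + \left\| \sum_{k=1}^{n} \chi_{Y \cap A_k}
        - \sum_{k=1}^{n} \varphi_k \right\|_1
    < 2\epsilon.
\]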


This proves our special case.

The general case is now obvious: a step map $f \neq 0$ with respect to
all sets of finite measure is a finite linear combination

\[
  f = \sum_{j=1}^{m} v_j \chi_{Y_j}
\]

with $v_j \in E$, $v_j \neq 0$ for all $j$, and such that the sets $Y_j$
have finite measure. By definition, the space of these maps is
$L^1$-dense in $\mathcal{L}^1(\mu, E)$. For each $\chi_{Y_j}$ we can find
a step function $\varphi_j$ with respect to $\mathcal{A}$ such that

\[
  \|\chi_{Y_j} - \varphi_j\|_1 < \frac{\epsilon}{m |v_j|}.
\]

It follows immediately that

\[
  \left\| f - \sum_{j=1}^{m} v_j \varphi_j \right\|_1 < \epsilon.
\]

This proves our theorem.

We can now strengthen the corollaries of Theorem 5.15. 


Corollary 6.4. Let $\mathcal{A}$ be a subalgebra of $\mathcal{M}$,
consisting of sets of finite measure, generating $\mathcal{M}$. Assume
that $X$ is $\sigma$-finite with respect to $\mathcal{A}$. Let
$f \in \mathcal{L}^1(\mu)$. If

\[
  \int_A f \, d\mu = 0
\]

for all $A \in \mathcal{A}$, then $f$ is equal to 0 almost everywhere.


Proof. Our assumption implies by linearity that

\[
  \int_X f \varphi \, d\mu = 0
\]

for all real step functions $\varphi$ with respect to $\mathcal{A}$. Let
$Y$ be a set of finite measure. By Theorem 6.3 and Lemma 3.1, we can find
a sequence of step functions $\{\varphi_n\}$ with respect to
$\mathcal{A}$ which converges almost everywhere to $\chi_Y$ and is also
$L^1$-convergent to $\chi_Y$. Taking $\inf(\varphi_n, 1)$ and
$\sup(\varphi_n, 0)$ if necessary, we may assume without loss of
generality that $0 \leq \varphi_n \leq 1$. Then

\[
  |f \varphi_n| \leq |f|
\]

and $\{f \varphi_n\}$ converges almost everywhere to $f \chi_Y$. By the
dominated convergence theorem it follows that

\[
  0 = \int_X f \varphi_n \quad \text{converges to} \quad
  \int_X f \chi_Y = \int_Y f.
\]

This proves that

\[
  \int_Y f = 0
\]

for all sets of finite measure $Y$. By the $\sigma$-finiteness, every
measurable set is a countable union of sets of finite measure. Since
$f \chi_Y = 0$ almost everywhere by Theorem 5.15, we conclude that
$f = 0$ almost everywhere, thus proving our corollary.


Example. Take $E = \mathbf{R}$ and let $X = \mathbf{R}$ also. Let
$\mathcal{A}$ be the algebra consisting of sets which are finite unions
of bounded intervals (obviously an algebra). We shall show in §9 that
there is a unique measure on the $\sigma$-algebra generated by
$\mathcal{A}$ such that the measure of an interval is its length. Thus we
can develop integration theory on the reals, and we can apply the
corollary to Theorem 6.3. Furthermore, the infinitely differentiable
functions which vanish outside a compact set are dense in
$\mathcal{L}^1$. In fact, given a characteristic function $\chi_Y$ of a
finite interval, we can find a $C^\infty$ function $\varphi$ which is
equal to $\chi_Y$ except in a given $\epsilon$-neighborhood of the two
end points of the interval, and $0 \leq \varphi \leq 1$. Thus as an
application of our corollary, we see that if

\[
  \int_{\mathbf{R}} f \varphi \, d\mu = 0
\]

for all $C^\infty$ functions $\varphi$ vanishing outside some compact
set, then $f$ is equal to 0 almost everywhere. We shall state this result
formally later in $\mathbf{R}^n$.
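
One standard way to produce such a $C^\infty$ function, offered here as
an illustrative sketch rather than as the construction intended in the
text, starts from the function equal to $e^{-1/t}$ for $t > 0$ and to 0
for $t \leq 0$. The names `g`, `smooth_step`, and `cutoff` are
hypothetical helpers, and the sketch assumes $\epsilon$ is smaller than
half the length of the interval $[a, b]$.

```python
import math

def g(t):
    # infinitely differentiable on R, identically 0 for t <= 0
    return math.exp(-1.0 / t) if t > 0 else 0.0

def smooth_step(t):
    # C-infinity, equal to 0 for t <= 0 and to 1 for t >= 1;
    # the denominator is positive because t > 0 or 1 - t > 0
    return g(t) / (g(t) + g(1.0 - t))

def cutoff(x, a, b, eps):
    # agrees with the indicator of [a, b] outside the open
    # eps-neighborhoods of the end points a and b
    rise = smooth_step((x - (a - eps)) / (2.0 * eps))
    fall = smooth_step(((b + eps) - x) / (2.0 * eps))
    return rise * fall

# Sample evaluation for the interval [0, 1] with eps = 1/8.
values = [cutoff(k / 64.0, 0.0, 1.0, 0.125) for k in range(-32, 97)]
```

The resulting function takes values in $[0, 1]$, equals 1 on
$[a + \epsilon, b - \epsilon]$, and vanishes outside
$[a - \epsilon, b + \epsilon]$.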


Remark. We observe that the domain of validity of Theorem 6.3 is
actually greater than it seems, i.e. the hypothesis of
$\sigma$-finiteness is to some extent superfluous. Indeed, every map in
$\mathcal{L}^1(\mu)$, being a limit almost everywhere of step maps, must
vanish outside some set which is a denumerable union of sets of finite
measure. In determining a dense subset of $\mathcal{L}^1$, we are merely
attempting to approximate each individual map $f$. Thus the hypothesis
under which Theorem 6.3 holds can actually be weakened to the following:

Every set of finite measure is contained in a countable union of sets of
$\mathcal{A}$.

All the applications I know of actually occur in the $\sigma$-finite case
as we defined $\sigma$-finite, but one should keep in mind that in case
of need, one could take the preceding property as the definition of
$\sigma$-finiteness with respect to $\mathcal{A}$, and still end up with
the corresponding result. This remark is the analogue with respect to the
domain set of the remark preceding Corollary 5.14, with respect to the
image space. We see that the $\mathcal{L}^1$ theory has a built-in
countability property for each one of its elements.


VI, §7. EXTENSION OF POSITIVE MEASURES
FROM ALGEBRAS TO $\sigma$-ALGEBRAS


In the previous sections, we started with a positive measure on a
$\sigma$-algebra, and then defined the integral for certain limits of
step maps. We now want to show how we can obtain such measures starting
with fewer data.

We recall that an algebra $\mathcal{A}$ of subsets of $X$ is a collection
of subsets containing the empty set, such that $\mathcal{A}$ is closed
under finite unions and intersections, and such that if $A$,
$B \in \mathcal{A}$ then $A - B \in \mathcal{A}$.

By a positive measure on an algebra $\mathcal{A}$, we mean a map

\[
  \mu \colon \mathcal{A} \to [0, \infty]
\]

such that $\mu(\emptyset) = 0$, and such that $\mu$ is countably additive
on $\mathcal{A}$. This means that if $\{A_n\}$ is a sequence of disjoint
elements of $\mathcal{A}$, and if their union $\bigcup A_n$ is also in
$\mathcal{A}$, then

\[
  \mu\left( \bigcup_{n=1}^{\infty} A_n \right)
    = \sum_{n=1}^{\infty} \mu(A_n).
\]


Under a suitable countability assumption, we shall prove that a measure
on an algebra can be extended uniquely to a measure on the
$\sigma$-algebra generated by $\mathcal{A}$. Observe that the countability
condition is necessary for this to be possible, i.e. we could not merely
assume that $\mu$ is finitely additive on $\mathcal{A}$. For instance,
consider a denumerable set $X = \{x_n\}$, and let $\mathcal{A}$ be the
algebra of all subsets. Let $x_n$ have measure $1/2^n$, and let a finite
set have measure equal to the sum of the measures of its elements. Let an
infinite set have infinite measure. Then we have defined a finitely
additive function which is not a measure.
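
The failure of countable additivity in this example can be checked by a
small computation. The following is only a sketch: it assumes the points
are indexed $n = 1, 2, \ldots$, and the names `mu_finite`, `MU_X` and
`partial_sum` are ours, not notation from the text. The partial sums over
the singletons stay below 1, while $X$ itself has been assigned infinite
measure.

```python
from fractions import Fraction
import math


def mu_finite(indices):
    """Measure of the finite set {x_n : n in indices}: sum of the masses 1/2**n."""
    return sum((Fraction(1, 2 ** n) for n in set(indices)), Fraction(0))


# X is infinite, so the example assigns it infinite measure.
MU_X = math.inf


def partial_sum(N):
    """mu({x_1}) + ... + mu({x_N}), a partial sum of the series over singletons."""
    return sum((mu_finite([n]) for n in range(1, N + 1)), Fraction(0))
```

The partial sums equal $1 - 2^{-N}$, so the series over the singletons
converges to 1, which differs from the value assigned to their union $X$.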


Theorem 7.1 (Hahn). Let $\mu$ be a positive measure on an algebra
$\mathcal{A}$ in $X$, and assume that $X$ can be expressed as a
denumerable union of sets of $\mathcal{A}$. Then $\mu$ can be extended to
a positive measure on the $\sigma$-algebra $\mathcal{M}$ generated by
$\mathcal{A}$, so that for $Y \in \mathcal{M}$,

    $$ \mu(Y) = \inf \sum_{n=1}^{\infty} \mu(A_n), $$

the inf being taken over all sequences $\{A_n\}$ in $\mathcal{A}$ whose
union contains $Y$. If $X$ can be expressed as a countable union of sets
of finite measure in $\mathcal{A}$, then there exists a unique extension
of $\mu$ to a positive measure on $\mathcal{M}$.


Proof. The proof will proceed in two steps and needs the notion of an 
outer measure. 


Let $\mathcal{N}$ be a $\sigma$-algebra in a set $X$. An outer measure
$\mu$ on $\mathcal{N}$ is a function $\mu\colon \mathcal{N} \to [0, \infty]$
satisfying the conditions:


OM 1. We have $\mu(\emptyset) = 0$.
OM 2. If $A, B \in \mathcal{N}$ and $A \subset B$, then $\mu(A) \le \mu(B)$.
OM 3. If $\{A_n\}$ is a sequence of elements of $\mathcal{N}$, then

    $$ \mu\Bigl(\bigcup A_n\Bigr) \le \sum \mu(A_n). $$


Lemma 7.2. Let $\mu$ be a positive measure on an algebra $\mathcal{A}$ in
$X$, and assume that $X$ can be expressed as a denumerable union of sets
of $\mathcal{A}$. On the $\sigma$-algebra of all subsets of $X$, define

    $$ \mu^*(Y) = \inf \sum \mu(A_n), $$

the inf being taken over all sequences $\{A_n\}$ of elements of
$\mathcal{A}$ whose union contains $Y$. Then $\mu^*$ is an outer measure
which extends $\mu$.


Proof. We first show that if $A \in \mathcal{A}$, then
$\mu^*(A) = \mu(A)$, in other words $\mu^*$ extends $\mu$. Since

    $$ A = A \cup \emptyset \cup \emptyset \cup \cdots $$

we see that $\mu^*(A) \le \mu(A)$. Conversely, given $\varepsilon$, let
$\{A_n\}$ be a sequence of elements of $\mathcal{A}$ whose union covers
$A$, and such that

    $$ \sum \mu(A_n) \le \mu^*(A) + \varepsilon. $$

Since $A = \bigcup (A_n \cap A)$ it follows that

    $$ \mu(A) \le \sum \mu(A_n \cap A) \le \sum \mu(A_n) \le \mu^*(A) + \varepsilon. $$

This proves that $\mu(A) \le \mu^*(A)$, whence $\mu(A) = \mu^*(A)$, as
desired.


From now on we omit the $*$ on $\mu$ since $\mu^*$ and $\mu$ take the same
values on $\mathcal{A}$. We show that our extended $\mu$ is an outer
measure. The first two properties OM 1 and OM 2 are obvious. As for OM 3,
it is clearly an $\varepsilon/2^n$ proof: let $\{Y_j\}$ be a sequence of
subsets of $X$ which we may assume have finite measure. Given
$\varepsilon$, for each $j$, let $\{A_n^{(j)}\}$ $(n = 1, 2, \ldots)$ be a
sequence in $\mathcal{A}$ whose union covers $Y_j$ and such that

    $$ \sum_{n=1}^{\infty} \mu(A_n^{(j)}) \le \mu(Y_j) + \frac{\varepsilon}{2^j}. $$

Then the denumerable family $\{A_n^{(j)}\}$ (for $j$, $n$ positive
integers) covers $\bigcup Y_j$, and we have

    $$ \mu\Bigl(\bigcup Y_j\Bigr) \le \sum_{n,j} \mu(A_n^{(j)}) \le \sum_{j=1}^{\infty} \mu(Y_j) + \varepsilon. $$

This proves our proposition.


Let $\mu$ be an outer measure on the set of all subsets of $X$. We say
that a subset $A$ of $X$ is $\mu$-measurable if for all subsets $Z$ of $X$
we have

    $$ \mu(Z) = \mu(Z \cap A) + \mu(Z \cap \mathscr{C}A). $$


Lemma 7.3. Let $\mu$ be an outer measure on the subsets of $X$. Let
$\mathcal{S}$ be the collection of all subsets of $X$ which are
$\mu$-measurable. Then $\mathcal{S}$ is a $\sigma$-algebra, and $\mu$ is a
positive measure on $\mathcal{S}$.


Proof. Since we deal only with $\mu$, we omit the prefix $\mu$-. We first
prove that $\mathcal{S}$ is an algebra. It obviously contains the empty
set, and if $A$ is measurable, it is clear that $\mathscr{C}A$ is
measurable (the definition of measurable is symmetric in $A$ and
$\mathscr{C}A$). Let $A$, $B$ be measurable. We show that $A \cap B$ is
measurable. Let $Z$ be any subset of $X$. Since $B$ is measurable, we get

    $$ \mu(Z \cap A \cap B) + \mu(Z \cap A \cap \mathscr{C}B) = \mu(Z \cap A). $$

Add $\mu(Z \cap \mathscr{C}A)$ to both sides. On the right we obtain
$\mu(Z)$ because $A$ is measurable. To prove that $A \cap B$ is
measurable, it will suffice to prove that

    $$ \mu(Z \cap \mathscr{C}(A \cap B)) = \mu(Z \cap A \cap \mathscr{C}B) + \mu(Z \cap \mathscr{C}A). $$

But this is seen by using the fact that $A$ is measurable, and writing

    $$ \mu(Z \cap \mathscr{C}(A \cap B)) = \mu(Z \cap \mathscr{C}(A \cap B) \cap A) + \mu(Z \cap \mathscr{C}(A \cap B) \cap \mathscr{C}A). $$

Thus $A \cap B$ is measurable.

Next we observe that if $A_1, \ldots, A_n$ are disjoint measurable sets,
and $Z$ is arbitrary, then

    $$ \mu(Z \cap (A_1 \cup \cdots \cup A_n)) = \sum_{k=1}^{n} \mu(Z \cap A_k). $$

This follows for $n = 2$, replacing $Z$ by $Z \cap (A_1 \cup A_2)$ in the
definition of measurability, and then by induction. Let now $\{A_n\}$ be a
sequence of disjoint measurable sets, and let $A$ be their union. Using
the fact that $\mu$ is an outer measure, we get for any subset $Z$:

    $$ \mu(Z) = \mu(Z \cap (A_1 \cup \cdots \cup A_n)) + \mu(Z \cap \mathscr{C}(A_1 \cup \cdots \cup A_n)) $$
    $$ \ge \sum_{k=1}^{n} \mu(Z \cap A_k) + \mu(Z \cap \mathscr{C}A) $$

for all $n$, whence

    $$ \mu(Z) \ge \sum_{k=1}^{\infty} \mu(Z \cap A_k) + \mu(Z \cap \mathscr{C}A) $$
    $$ \ge \mu(Z \cap A) + \mu(Z \cap \mathscr{C}A) $$

because $\mu$ is an outer measure. The converse inequality

    $$ \mu(Z) \le \mu(Z \cap A) + \mu(Z \cap \mathscr{C}A) $$

is true again because $\mu$ is an outer measure. Thus we have equality.
This proves both that $A$ is measurable, so the measurable sets form a
$\sigma$-algebra, and that $\mu$ is countably additive on $\mathcal{S}$
(take $Z = A$ in the inequalities above), thus concluding the proof of the
lemma.
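
The definition of measurability can be tested by brute force on a finite
set. The sketch below is an illustration under our own assumptions: a
four-point set $X$, an algebra with two atoms of measures 1 and 2, and
helper names (`ALGEBRA`, `outer`, `is_measurable`) which are not from the
text. Because the algebra is finite and the measures are nonnegative, the
infimum over covering sequences reduces to a minimum over finite
subfamilies.

```python
from itertools import combinations

X = frozenset({0, 1, 2, 3})

# A measure on the algebra generated by the two atoms {0, 1} and {2, 3}.
ALGEBRA = {
    frozenset(): 0,
    frozenset({0, 1}): 1,
    frozenset({2, 3}): 2,
    X: 3,
}


def powerset(s):
    """All subsets of s, as frozensets."""
    items = sorted(s)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]


def outer(Y):
    """Smallest total measure of a family of algebra sets whose union covers Y."""
    best = None
    for r in range(1, len(ALGEBRA) + 1):
        for family in combinations(ALGEBRA, r):
            if Y <= frozenset().union(*family):
                total = sum(ALGEBRA[A] for A in family)
                best = total if best is None else min(best, total)
    return best


def is_measurable(A):
    """Check mu(Z) = mu(Z ∩ A) + mu(Z ∩ CA) for every subset Z of X."""
    return all(outer(Z) == outer(Z & A) + outer(Z - A) for Z in powerset(X))


MEASURABLE = {A for A in powerset(X) if is_measurable(A)}
```

In this small case the measurable sets are exactly the four sets of the
algebra: a set such as $\{0\}$ splits an atom and fails the test with
$Z = \{0, 1\}$.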


To prove the existence part of the theorem, all we need to show now is
that the sets of our original algebra $\mathcal{A}$ are measurable. Let
$A \in \mathcal{A}$ and let $Z$ be any subset of $X$. The inequality

    $$ \mu(Z) \le \mu(Z \cap A) + \mu(Z \cap \mathscr{C}A) $$

is true because $\mu$ is an outer measure. Conversely, given
$\varepsilon$ let $\{A_n\}$ be a sequence in $\mathcal{A}$ whose union
covers $Z$ and such that

    $$ \sum \mu(A_n) \le \mu(Z) + \varepsilon. $$

Then $Z \cap A$ is contained in the union of the sets $A_n \cap A$, and
$Z \cap \mathscr{C}A$ is contained in the union of the sets
$A_n \cap \mathscr{C}A = A_n - A$. Consequently

    $$ \mu(Z \cap A) + \mu(Z \cap \mathscr{C}A) \le \sum \mu(A_n \cap A) + \sum \mu(A_n \cap \mathscr{C}A) $$
    $$ = \sum \mu(A_n) $$
    $$ \le \mu(Z) + \varepsilon. $$

This proves the reverse inequality, and proves the existence of an
extension of $\mu$ to a measure on $\mathcal{S}$, whence on the
$\sigma$-algebra generated by $\mathcal{A}$.

Now for the uniqueness, we let $\mu$ be as we have just constructed it,
and let $\nu$ be any positive measure on the $\sigma$-algebra
$\mathcal{M}$ generated by $\mathcal{A}$, extending $\mu$ on
$\mathcal{A}$. Let $\{A_n\}$ be a sequence in $\mathcal{A}$ of sets of
finite $\mu$-measure, whose union is $X$. For any given $Y$ it suffices to
prove that

    $$ \nu(Y \cap A_n) = \mu(Y \cap A_n). $$

Thus it suffices to prove: if $A \in \mathcal{A}$ has finite measure and
$Y$ is in $\mathcal{M}$ and contained in $A$, then $\nu(Y) = \mu(Y)$. We
have

    $$ \mu(Y) = \inf \sum \mu(B_n) = \inf \sum \nu(B_n), $$

the inf taken over all sequences $\{B_n\}$ in $\mathcal{A}$ whose union
contains $Y$. This shows that $\nu(Y) \le \mu(Y)$. But then also,

    $$ \nu(A - Y) \le \mu(A - Y). $$

However

    $$ \mu(A) = \nu(A) = \nu(A - Y) + \nu(Y) \le \mu(A - Y) + \mu(Y) = \mu(A). $$

This proves that we must have $\nu(Y) = \mu(Y)$ and concludes the proof of
the theorem. (For another proof of uniqueness, cf. Exercise 10(b).)


Corollary 7.4. Let $(X, \mathcal{M}, \mu)$ be a measured space, and let
$\mathcal{A}$ be a subalgebra of $\mathcal{M}$ consisting of sets of
finite measure, generating $\mathcal{M}$, and such that $X$ is
$\sigma$-finite with respect to $\mathcal{A}$. A subset $Z$ of $X$ has
$\mu^*$-measure 0 if and only if given $\varepsilon$, there exists a
sequence $\{A_n\}$ in $\mathcal{A}$ whose union covers $Z$ and such that

    $$ \sum_{n=1}^{\infty} \mu(A_n) < \varepsilon. $$

Similarly for a set $Z \in \mathcal{M}$ of measure 0.


Proof. It is clear that a set satisfying the stated condition has measure
0. Conversely, we know from the theorem that the measure on $\mathcal{A}$
has a unique extension to $\mathcal{M}$ given as the outer measure. From
this our assertion is obvious.


Remark. In euclidean space, with respect to Lebesgue measure (discussed
later), the algebra is that formed of finite disjoint unions of cubes.
Thus a set has measure 0 in $\mathbf{R}^n$ if and only if given
$\varepsilon$ it can be covered by a sequence of cubes, the sum of whose
volumes is $< \varepsilon$. In many applications, one deals exclusively
with sets of measure 0, and one does not need any fancy measure theory or
integration theory. Thus the reader should keep this in mind so as to be
more comfortable when meeting such applications.
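
As an illustration of this covering criterion in dimension 1, the sketch
below covers the $n$-th point of a finite list by an interval of length
$\varepsilon/2^n$, so that the total length stays below $\varepsilon$ no
matter how long the list is. The function names `cover` and
`total_length` are hypothetical, chosen for this example only; exact
rational arithmetic is used to avoid rounding.

```python
from fractions import Fraction


def cover(points, eps):
    """Return intervals (a, b), the n-th one centered at the n-th point
    and of length eps / 2**n (counting n from 1)."""
    intervals = []
    for n, x in enumerate(points, start=1):
        half = eps / 2 ** (n + 1)  # half of the length eps / 2**n
        intervals.append((x - half, x + half))
    return intervals


def total_length(intervals):
    """Sum of the lengths of the given intervals."""
    return sum((b - a for a, b in intervals), Fraction(0))
```

The same $\varepsilon/2^n$ device applied to an enumeration of a
denumerable set shows that every denumerable subset of $\mathbf{R}$ has
measure 0.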


VI, §8. PRODUCT MEASURES AND INTEGRATION
ON A PRODUCT SPACE


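Let $X$, $Y$ be sets and $\mathcal{A}$, $\mathcal{B}$ algebras of subsets
in $X$, $Y$ respectively. By a rectangle with respect to $\mathcal{A}$,
$\mathcal{B}$ we mean a product $A \times B$ with $A \in \mathcal{A}$ and
$B \in \mathcal{B}$. We let $\mathcal{A} \times \mathcal{B}$ denote the
collection of all finite disjoint unions of rectangles with respect to
$\mathcal{A}$, $\mathcal{B}$. (Unless needed for clarity, we omit the
reference to $\mathcal{A}$, $\mathcal{B}$ in what follows.) We contend
that $\mathcal{A} \times \mathcal{B}$ is an algebra in $X \times Y$. This
is easily proved as follows.

The empty set is in $\mathcal{A} \times \mathcal{B}$. We have the
identities:

    $$ (A_1 \times B_1) \cap (A_2 \times B_2) = (A_1 \cap A_2) \times (B_1 \cap B_2) $$

and

    $$ (A_1 \times B_1) - (A_2 \times B_2) = [(A_1 - A_2) \times B_1] \cup [(A_1 \cap A_2) \times (B_1 - B_2)]. $$

If $P, Q \in \mathcal{A} \times \mathcal{B}$ these show that both
$P \cap Q$ and $P - Q \in \mathcal{A} \times \mathcal{B}$. Since

    $$ P \cup Q = (P - Q) \cup Q $$

and $(P - Q) \cap Q$ is empty, it follows that
$P \cup Q \in \mathcal{A} \times \mathcal{B}$. This proves that
$\mathcal{A} \times \mathcal{B}$ is an algebra.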
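
The two rectangle identities are purely set-theoretic, so they can be
checked exhaustively on small finite sets. The following sketch does this;
the helper names `subsets`, `rect` and `identities_hold` are ours. It also
checks that the two pieces in the difference formula are disjoint, which
is what makes the difference a finite disjoint union of rectangles.

```python
from itertools import combinations, product


def subsets(s):
    """All subsets of s, as frozensets."""
    items = sorted(s)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]


def rect(A, B):
    """The rectangle A x B as a set of pairs."""
    return frozenset(product(A, B))


def identities_hold(X, Y):
    """Verify the intersection and difference identities for all rectangles."""
    for A1, A2 in product(subsets(X), repeat=2):
        for B1, B2 in product(subsets(Y), repeat=2):
            if rect(A1, B1) & rect(A2, B2) != rect(A1 & A2, B1 & B2):
                return False
            piece1 = rect(A1 - A2, B1)
            piece2 = rect(A1 & A2, B1 - B2)
            if rect(A1, B1) - rect(A2, B2) != piece1 | piece2:
                return False
            if piece1 & piece2:  # the two pieces must be disjoint
                return False
    return True
```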


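We denote by $\mathcal{A} \otimes \mathcal{B}$ the $\sigma$-algebra
generated by $\mathcal{A} \times \mathcal{B}$. Also, we denote by
$\mathcal{A}^{\sigma}$ the $\sigma$-algebra generated by $\mathcal{A}$ in
$X$. We have

    $$ \mathcal{A}^{\sigma} \otimes \mathcal{B}^{\sigma} = (\mathcal{A} \times \mathcal{B})^{\sigma}. $$

Proof. Since
$(\mathcal{A} \times \mathcal{B}) \subset (\mathcal{A}^{\sigma} \times \mathcal{B}^{\sigma}) \subset (\mathcal{A}^{\sigma} \otimes \mathcal{B}^{\sigma})$
it follows that

    $$ (\mathcal{A} \times \mathcal{B})^{\sigma} \subset \mathcal{A}^{\sigma} \otimes \mathcal{B}^{\sigma}, $$

and we must prove the reverse inclusion. For each $B \in \mathcal{B}$
consider the $\sigma$-algebra in $X \times B$ generated by all sets
$A \times B$ with $A \in \mathcal{A}$. It is contained in
$(\mathcal{A} \times \mathcal{B})^{\sigma}$, which therefore contains
$\mathcal{A}^{\sigma} \times \{B\}$ for all $B \in \mathcal{B}$.

Now for any $A \in \mathcal{A}^{\sigma}$, it follows that
$\{A\} \times \mathcal{B}^{\sigma}$ is contained in
$(\mathcal{A} \times \mathcal{B})^{\sigma}$. Thus finally,

    $$ \mathcal{A}^{\sigma} \times \mathcal{B}^{\sigma} \subset (\mathcal{A} \times \mathcal{B})^{\sigma}, $$

whence the reverse inclusion

    $$ \mathcal{A}^{\sigma} \otimes \mathcal{B}^{\sigma} \subset (\mathcal{A} \times \mathcal{B})^{\sigma}, $$

which proves what we wanted.


Lemma 8.1. Let $\mathcal{M}$ be a $\sigma$-algebra in $X$ and
$\mathcal{N}$ a $\sigma$-algebra in $Y$.

(i) Let $Q \in \mathcal{M} \otimes \mathcal{N}$ and for each $x \in X$
let $Q_x$ be the set of $y$ such that $(x, y) \in Q$. Then
$Q_x \in \mathcal{N}$.

(ii) Let $f\colon X \times Y \to Z$ be an
$\mathcal{M} \otimes \mathcal{N}$-measurable map into a topological space
$Z$. For each $x \in X$, the map

    $$ f_x\colon Y \to Z $$

given by $f_x(y) = f(x, y)$ is measurable.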


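Proof. Let $\mathcal{S}$ be the collection of subsets
$Q \in \mathcal{M} \otimes \mathcal{N}$ such that $Q_x \in \mathcal{N}$
for all $x$. Then $\mathcal{S}$ contains all rectangles $A \times B$ with
$A \in \mathcal{M}$ and $B \in \mathcal{N}$. It will suffice to prove that
$\mathcal{S}$ is a $\sigma$-algebra. The point is that the operation
$Q \mapsto Q_x$ commutes with all the operations of set theory. Indeed,
$X \times Y \in \mathcal{S}$. If $Q \in \mathcal{S}$, then
$\mathscr{C}Q \in \mathcal{S}$ because
$(\mathscr{C}Q)_x = \mathscr{C}(Q_x)$. If $Q$, $P$ are in $\mathcal{S}$,
then

    $$ (P \cap Q)_x = P_x \cap Q_x. $$

If $\{Q_n\}$ is a sequence in $\mathcal{S}$, then
$(\bigcup Q_n)_x = \bigcup (Q_n)_x$. Thus we see that $\mathcal{S}$ is a
$\sigma$-algebra. This proves (i). As for (ii), if $V$ is open in $Z$,
then

    $$ (f^{-1}(V))_x = f_x^{-1}(V), $$

so $f_x$ is measurable. This proves the lemma.

For the rest of this section we let $(X, \mathcal{M}, \mu)$ and
$(Y, \mathcal{N}, \nu)$ be $\sigma$-finite measured spaces. Let
$\mathcal{A}$ and $\mathcal{B}$ be the algebras of sets of finite measure
in $\mathcal{M}$ and $\mathcal{N}$ respectively.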
If $f$ is a step map with respect to $\mathcal{A} \times \mathcal{B}$,
then we can define a repeated integral of $f$. Indeed, for each $x \in X$
the map $f_x$ is a step map on $Y$ with respect to $\mathcal{B}$. In fact,
if

    $$ f = v\chi_{A \times B} $$

for some $v \in E$ and $A \in \mathcal{A}$, $B \in \mathcal{B}$, then

    $$ f_x = v(\chi_{A \times B})_x = v\chi_A(x)\chi_B \quad\text{and}\quad f_x(y) = v\chi_A(x)\chi_B(y). $$

Our assertion follows by linearity. Thus for each $x \in X$, we can form a
first integral,

    $$ \int_Y f_x \, d\nu. $$

If $f = v\chi_{A \times B}$ as above, we see that

    $$ \int_Y f_x \, d\nu = v\chi_A(x)\nu(B). $$

If $f$ is a step map with respect to $\mathcal{A} \times \mathcal{B}$, we
conclude that the map

    $$ x \mapsto \int_Y f_x \, d\nu $$

is a step map with respect to $\mathcal{A}$.

We may therefore integrate this map over $X$, with respect to $\mu$, and
the repeated integral will be denoted by any one of the following
notations:

    $$ \int_X \Bigl[ \int_Y f_x \, d\nu \Bigr] d\mu(x), \qquad \int_X d\mu(x) \int_Y f_x \, d\nu, $$
    $$ \int_X \int_Y f(x, y) \, d\nu(y) \, d\mu(x), \qquad \int_X \int_Y f \, d\nu \, d\mu. $$

We use similar notation if we reverse the order of integration, and it is
clear that on step maps, the repeated integrals are equal to each other,
no matter what order of integration is chosen. In fact, we see at once
that for $A \in \mathcal{A}$ and $B \in \mathcal{B}$ we have

    $$ \int_X \int_Y \chi_{A \times B} \, d\nu \, d\mu = \mu(A)\nu(B) = \int_Y \int_X \chi_{A \times B} \, d\mu \, d\nu. $$

The repeated integral is linear on the space of step maps.

Proof. Obvious, because each one of the single integrals is linear.

In particular, there is a unique finitely additive positive function
$\mu \times \nu$ on $\mathcal{A} \times \mathcal{B}$ such that for
$A \in \mathcal{A}$ and $B \in \mathcal{B}$ we have

    $$ (\mu \times \nu)(A \times B) = \mu(A)\nu(B). $$


Theorem 8.2. Let $(X, \mathcal{M}, \mu)$ and $(Y, \mathcal{N}, \nu)$ be
$\sigma$-finite measured spaces. There exists a unique positive measure
$\mu \otimes \nu$ on $\mathcal{M} \otimes \mathcal{N}$ such that for all
sets $A$, $B$ of finite measure in $\mathcal{M}$ and $\mathcal{N}$
respectively we have

    $$ (\mu \otimes \nu)(A \times B) = \mu(A)\nu(B). $$


Proof. By Hahn's theorem, it suffices to prove that $\mu \times \nu$ is
countably additive on $\mathcal{A} \times \mathcal{B}$, i.e. is a measure
on $\mathcal{A} \times \mathcal{B}$, where $\mathcal{A}$, $\mathcal{B}$
are the algebras of sets of finite measure in $\mathcal{M}$ and
$\mathcal{N}$ respectively. Let $\{Q_n\}$ be an increasing sequence in
$\mathcal{A} \times \mathcal{B}$ whose union is an element $Q$ of
$\mathcal{A} \times \mathcal{B}$. Let $f_n$ be the characteristic function
of $Q_n$. Then $\{f_n\}$ is increasing to the characteristic function $f$
of $Q$. Furthermore, for each $x \in X$, the function $(f_n)_x$ is
increasing to $f_x$. By the monotone convergence theorem with respect to
$\nu$, we see that for each $x$,

    $$ \int_Y (f_n)_x \, d\nu \quad\text{is increasing to}\quad \int_Y f_x \, d\nu. $$

Now we apply the monotone convergence theorem with respect to $\mu$, to
conclude that

    $$ \int_X \int_Y (f_n)_x \, d\nu \, d\mu \quad\text{converges to}\quad \int_X \int_Y f_x \, d\nu \, d\mu. $$

This proves our theorem.
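
On finite sets every function is a step map, so the product measure and
the two repeated integrals reduce to finite sums and can be compared
directly. The sketch below assumes two small weighted sets of our own
choosing; `MU`, `NU` and the function names are illustrative, not notation
from the text.

```python
from fractions import Fraction

# Point masses of two finite measured spaces X = {"a", "b"} and Y = {0, 1, 2}.
MU = {"a": Fraction(1, 2), "b": Fraction(3, 2)}
NU = {0: Fraction(1, 3), 1: Fraction(2, 3), 2: Fraction(1)}


def product_measure(Q):
    """(mu x nu)(Q) for a set Q of pairs (x, y)."""
    return sum((MU[x] * NU[y] for (x, y) in Q), Fraction(0))


def integrate_y_first(f):
    """Repeated integral: integrate f_x over Y, then the result over X."""
    return sum((MU[x] * sum((f(x, y) * NU[y] for y in NU), Fraction(0))
                for x in MU), Fraction(0))


def integrate_x_first(f):
    """Repeated integral in the reverse order: first over X, then over Y."""
    return sum((NU[y] * sum((f(x, y) * MU[x] for x in MU), Fraction(0))
                for y in NU), Fraction(0))
```

For the characteristic function of a rectangle $A \times B$ both repeated
integrals return $\mu(A)\nu(B)$, and by linearity they agree on every
function on the finite product.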


Lemma 8.3. Let $Z$ be a set of $(\mu \otimes \nu)$-measure 0 in
$X \times Y$. Then for almost all $x \in X$ we have $\nu(Z_x) = 0$.


Proof. For each positive integer $n$, let $S_n$ be the set of all $x$ such
that $\nu(Z_x) \ge 1/n$. Let $S = \bigcup S_n$. It will suffice to prove
that $S$ is contained in a set of measure 0. Given $\varepsilon$, let
$\{R_j\}$ be a sequence of rectangles whose union contains $Z$ and such
that

    $$ \sum_{j=1}^{\infty} (\mu \times \nu)(R_j) < \frac{\varepsilon}{n 2^n}. $$

Such $\{R_j\}$ exists by the corollary of Hahn's theorem. Let $T_n$ be the
set of all $x \in X$ such that

    $$ \frac{1}{n} \le \sum_{j=1}^{\infty} \nu((R_j)_x). $$

Then $T_n$ is measurable, and $S_n \subset T_n$. Furthermore, the
expression on the right is integrable with respect to $x$, and we find
that

    $$ \frac{1}{n}\mu(T_n) \le \sum_{j=1}^{\infty} \int_X \nu((R_j)_x) \, d\mu = \sum_{j=1}^{\infty} (\mu \times \nu)(R_j) < \frac{\varepsilon}{n 2^n}. $$

This shows that $\mu(T_n) < \varepsilon/2^n$, whence $S$ is contained in a
set of measure 0, thereby proving our lemma. The converse will follow from
Corollary 8.5.


Suppose that $f$ is in $\mathcal{L}^1(\mu \otimes \nu, E)$ and let $g$
differ from $f$ only on a set of $(\mu \otimes \nu)$-measure 0, say $Z$.
Then for each $x \in X$, the maps $f_x$ and $g_x$ differ only at those
points $y \in Y$ such that $(x, y) \in Z$, i.e. those $y$ such that
$y \in Z_x$. By the lemma, there exists a set $S$ of measure 0 in $X$ such
that for all $x \notin S$ we have $\nu(Z_x) = 0$. From this we conclude
that for such $x \notin S$, the maps $f_x$ and $g_x$ differ only on a set
of measure 0. Thus $f_x$ is in $\mathcal{L}^1(\nu, E)$ if and only if
$g_x$ is in $\mathcal{L}^1(\nu, E)$, and if this is the case, the
integrals with respect to $\nu$ will be equal. This is the situation which
we meet in the next theorem.


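Theorem 8.4 (Fubini's Theorem, Part 1). Let
$f \in \mathcal{L}^1(\mu \otimes \nu)$. Then for almost all $x$, the map
$f_x$ is in $\mathcal{L}^1(\nu)$, the map given by

    $$ x \mapsto \int_Y f_x \, d\nu $$

for almost all $x$ (and defined arbitrarily for other $x$) is in
$\mathcal{L}^1(\mu)$; and we have

    $$ \int_{X \times Y} f \, d(\mu \otimes \nu) = \int_X \int_Y f_x \, d\nu \, d\mu(x). $$

There is a natural Banach space norm preserving isomorphism

    $$ L^1(\mu \otimes \nu, E) \to L^1(\mu, L^1(\nu, E)). $$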


Proof. By Theorem 6.3, we can find a sequence $\{\varphi_n\}$ of step
mappings with respect to $\mathcal{A} \times \mathcal{B}$ which converges
to $f$ both in $L^1$-seminorm and almost everywhere on $X \times Y$. As
before, $\mathcal{A}$, $\mathcal{B}$ are the algebras of sets of finite
measure in $\mathcal{M}$ and $\mathcal{N}$ respectively. We let $Z$ be a
set of $(\mu \otimes \nu)$-measure 0 in $X \times Y$ such that
$\{\varphi_n\}$ converges pointwise to $f$ outside $Z$. We let $S$ be a
set of $\mu$-measure 0 in $X$ such that for $x \notin S$ we have
$\nu(Z_x) = 0$.

If $x \notin S$, it follows that $\{\varphi_{n,x}\}$ converges pointwise
to $f_x$ on the complement of $Z_x$.
Now we observe that for each $n$, the map

    $$ \Phi_n\colon x \mapsto \varphi_{n,x} $$

is a map of $X$ into $\mathrm{St}(\mathcal{B})$. Indeed, $\varphi_{n,x}$
is a step map with respect to $\mathcal{B}$, and for $v \in E$, the
formula

    $$ (v\chi_{A \times B})_x = v\chi_A(x)\chi_B $$

shows that $\Phi_n$ is step with respect to $\mathcal{A}$. We view the
space $\mathrm{St}(\mathcal{B})$ as having the $L^1$-seminorm. We contend
that $\{\Phi_n\}$ is a Cauchy sequence. This is easily seen, because

    $$ \|\Phi_n - \Phi_m\|_1 = \int_X |\Phi_n - \Phi_m| \, d\mu $$
    $$ = \int_X \int_Y |\varphi_n(x, y) - \varphi_m(x, y)| \, d\nu(y) \, d\mu(x) $$
    $$ = \|\varphi_n - \varphi_m\|_1. $$

(Of course, the $L^1$-seminorms taken on the right and left of the
preceding equation refer to different spaces.)

By the fundamental Lemma 3.1, we may assume without loss of generality
(using a subsequence if necessary) that there exists a set $T$ of measure
0 in $X$ such that for $x \notin T$ the sequence $\{\Phi_n(x)\}$ is
Cauchy. [Lemma 3.1 and its proof are valid for values in a Banach space.
For our purposes, we note that the proof of this lemma applies as well in
the seminormed case to yield a pointwise Cauchy sequence for almost all
$x$. Alternatively, we may also take the natural map of
$\mathrm{St}(\mathcal{B})$ into $L^1(\nu)$ and apply the lemma with
respect to the Banach space $L^1(\nu)$.] This means that for each
$x \notin T$, the sequence

    $$ \{\Phi_n(x)\} = \{\varphi_{n,x}\} $$

is Cauchy (that is, $L^1$-Cauchy with respect to $\nu$). If
$x \notin S \cup T$, we know that $\{\varphi_{n,x}(y)\}$ converges to
$f_x(y)$ for almost all $y \in Y$. Hence by Corollary 5.10, we conclude
that $f_x \in \mathcal{L}^1(\nu)$ and that $\{\varphi_{n,x}\}$ is
$L^1$-convergent to $f_x$, so that

    $$ \int_Y \varphi_{n,x} \, d\nu \quad\text{converges to}\quad \int_Y f_x \, d\nu $$

for all $x \notin S \cup T$.
Finally, we note that the map

    $$ \Psi_n\colon x \mapsto \int_Y \varphi_{n,x} \, d\nu $$

is a step map with respect to $\mathcal{A}$. [It is in fact the composite
map of $\Phi_n$ and the integral $\int_Y d\nu$.] Furthermore, the sequence
$\{\Psi_n\}$ is Cauchy ($L^1$ with respect to $\mu$), as one sees by
repeating the argument given above to show that $\{\Phi_n\}$ is Cauchy.
Also for all $x \notin S \cup T$ we know that $\Psi_n(x)$ converges to the
map given by

    $$ \Psi(x) = \int_Y f_x \, d\nu. $$

Consequently $\{\Psi_n\}$ is $L^1$-convergent to $\Psi$, and as
$n \to \infty$,

    $$ \int_X \int_Y \varphi_{n,x} \, d\nu \, d\mu(x) \quad\text{converges to}\quad \int_X \int_Y f_x \, d\nu \, d\mu(x). $$

Since $\varphi_n$ is a step map and

    $$ \int_X \int_Y \varphi_{n,x} \, d\nu \, d\mu(x) = \int_{X \times Y} \varphi_n \, d(\mu \otimes \nu), $$

we see that Fubini's theorem is proved.


Corollary 8.5. Let $Q$ be a measurable subset of finite measure in
$X \times Y$. Then

    $$ \int_{X \times Y} \chi_Q \, d(\mu \otimes \nu) = \int_X \int_Y (\chi_Q)_x \, d\nu \, d\mu(x). $$


Proof. If $Q$ has finite measure, then $\chi_Q$ is in
$\mathcal{L}^1(\mu \otimes \nu)$ and the theorem applies.


Remark. Our version of Fubini's theorem as it applies to the situation in
the corollary does not yield the fact that the map

    $$ x \mapsto \int_Y (\chi_Q)_x \, d\nu = \nu(Q_x) $$

is measurable (only that it is $\mu$-measurable). It happens to be true
that the map is in fact measurable. Cf. Exercise 11.


In Fubini's theorem, we start with a map
$f \in \mathcal{L}^1(\mu \otimes \nu)$ and conclude that the various
partial mappings arising from this $f$ are in the corresponding
$\mathcal{L}^1$ spaces. One can ask for the converse, which is true,
properly formulated.


Lemma 8.6. Let $f\colon X \times Y \to E$ be a
$(\mu \otimes \nu)$-measurable map. Then for almost all $x$, the map $f_x$
is $\nu$-measurable.

Proof. Let $Z$ be a set of measure 0 in $X \times Y$ such that the
restriction of $f$ to the complement of $Z$ is measurable, and the image
of the complement of $Z$ in $X \times Y$ is separable. By Lemma 8.3, for
almost all $x$ the set $Z_x$ has measure 0, and by Lemma 8.1 the
restriction of $f_x$ to the complement of $Z_x$ in $Y$ is measurable,
whence $\nu$-measurable by M11. This proves our lemma.


Theorem 8.7 (Fubini's Theorem, Part 2). Let $f\colon X \times Y \to E$ be
a $(\mu \otimes \nu)$-measurable map. Assume that for almost all
$x \in X$ the map $f_x$ is in $\mathcal{L}^1(\nu)$, and that the map given
by

    $$ x \mapsto \int_Y |f_x| \, d\nu $$

(for almost all $x$, and arbitrary otherwise) is in
$\mathcal{L}^1(\mu, \mathbf{R})$. Then

    $$ f \in \mathcal{L}^1(\mu \otimes \nu, E) $$

and Part 1 of Fubini's theorem applies.


Proof. By Corollary 5.9 of the dominated convergence theorem, it suffices
to prove that $|f|$ is in $\mathcal{L}^1(\mu \otimes \nu, \mathbf{R})$,
and thus we may assume without loss of generality that $f$ is a
semipositive real function which is $(\mu \otimes \nu)$-measurable,
satisfying the other hypotheses of the theorem. By condition M9 of §1, we
can find a sequence of positive simple functions $\{\varphi_n\}$ which is
increasing to $f$ pointwise everywhere (changing $f$ if necessary on a set
of measure 0). Using the $\sigma$-finiteness of $X \times Y$, we may
assume further without loss of generality that each $\varphi_n$ vanishes
outside a set of finite measure, i.e. is step. For each $x$ the sequence
$\{\varphi_{n,x}\}$ is increasing to $f_x$. Whenever $x$ is such that
$f_x$ is in $\mathcal{L}^1(\nu)$, and $\varphi_{n,x}$ is
$\nu$-measurable, it follows that as $n \to \infty$,

    $$ \int_Y \varphi_{n,x} \, d\nu \quad\text{is increasing and convergent to}\quad \int_Y f_x \, d\nu. $$

We can apply the corollary of Fubini's theorem (by linearity), and the
monotone convergence theorem once more to conclude that the sequence given
by

    $$ \int_{X \times Y} \varphi_n \, d(\mu \otimes \nu) = \int_X \int_Y \varphi_{n,x} \, d\nu \, d\mu(x) $$

is increasing and convergent to

    $$ \int_X \int_Y f_x \, d\nu \, d\mu(x). $$

A final application of the monotone convergence theorem shows that $f$ is
in $\mathcal{L}^1$, thus proving our theorem.


VI, §9. THE LEBESGUE INTEGRAL IN $\mathbf{R}^p$


We start with $\mathbf{R}$, and the algebra of subsets consisting of
finite disjoint unions of intervals. The length function is easily seen to
extend to a finitely additive function on this algebra. To get our theory
going, we must show that it is a measure, i.e. countably additive. It is
in fact just as convenient to prove a slightly more general statement.


Theorem 9.1. Let $\{f_n\}$ be a sequence of functions $\ge 0$ on a closed
bounded interval $I$, decreasing monotonically to 0. Assume that each
$f_n$ is a step function with respect to intervals. Then the sequence of
(plain and ordinary) integrals

    $$ \int_I f_n(x) \, dx $$

decreases to 0.

Proof. For each $n$, the intervals on which $f_n$ is constant have a
finite number of end points. The union of such end points for all such
intervals and all $n = 1, 2, \ldots$ is countable, and can therefore be
covered by a sequence of open intervals $J_k$ such that

    $$ \sum_{k=1}^{\infty} l(J_k) < \varepsilon, $$

where $l$ is the length. Let $U = \bigcup J_k$. If $x \in I$ and
$x \notin U$, there exists some $n_x$ and an open interval $V_x$
containing $x$ such that $f_{n_x}(t) < \varepsilon$ for all $t \in V_x$.
Since the sequence $\{f_n\}$ is decreasing, it follows that
$f_m(t) < \varepsilon$ for all $m \ge n_x$ and all $t \in V_x$. The family
of open sets

    $$ \{J_k, V_x\}, \qquad k = 1, 2, \ldots;\ x \in I,\ x \notin U, $$

covers $I$, and hence there exists a finite subcovering

    $$ J_{k_1}, \ldots, J_{k_r}, V_{x_1}, \ldots, V_{x_s}. $$

Let $N = \max(n_{x_1}, \ldots, n_{x_s})$. If $n \ge N$, then

    $$ f_n(t) < \varepsilon \quad\text{if}\quad t \in V_{x_1} \cup \cdots \cup V_{x_s}. $$

The integral $\int_I f_n(x) \, dx$ is bounded by the sum of the integrals
of $f_n$ over the intervals $J_{k_1}, \ldots, J_{k_r}$ and over the union
$V_{x_1} \cup \cdots \cup V_{x_s}$. If $C$ is a bound for $f_1$ (and hence
all $f_n$) we conclude that

    $$ \int_I f_n(x) \, dx \le C\varepsilon + l(I)\varepsilon, $$

which proves our theorem.


Corollary 9.2. The length function of intervals extends uniquely to a
measure on the algebra consisting of finite disjoint unions of bounded
intervals.


Proof. If $\{A_n\}$ is a disjoint sequence in the algebra, whose union is
an element $A$ of the algebra, then the characteristic functions

    $$ \{\chi_A - \chi_{B_n}\} = \{\chi_{A - B_n}\} \quad\text{with}\quad B_n = A_1 \cup \cdots \cup A_n $$

form a decreasing sequence of step functions, converging pointwise to 0,
to which we can apply the theorem.


Having our measure on the algebra of finite disjoint unions of bounded
intervals, we can first obtain a $\sigma$-algebra and a measure on it by
Hahn's theorem. Then §3 gives us the integral on the reals. We can apply
the theory of integration on product spaces to get the integral on
$\mathbf{R}^p$, because $\mathbf{R}$ is obviously $\sigma$-finite with
respect to bounded intervals. Thus we now have integration on
$\mathbf{R}^p$. The $\sigma$-algebra of measurable sets in $\mathbf{R}^p$
obtained by the preceding procedure is that generated by the rectangles
(i.e. $p$-fold Cartesian products of intervals), and is thus the
$\sigma$-algebra of Borel sets in $\mathbf{R}^p$. The measure on this
algebra obtained as above is called Lebesgue measure. One can also extend
it to the completion of the $\sigma$-algebra generated by rectangles (cf.
Exercise 7). This makes no difference concerning integration, and is in
fact frequently very convenient.

We observe that for Lebesgue measure, rectangles have the expected
measure, namely the product of the lengths of the sides.


For the rest of this section, we let $\mu$ denote Lebesgue measure. One
customarily writes $\mathcal{L}^1(\mathbf{R}^p)$ instead of
$\mathcal{L}^1(\mu)$ in this case.


It is clear that $\mathbf{R}^p$ is $\sigma$-finite, being a union of
bounded rectangles, i.e. $p$-dimensional rectangles. Thus we can apply the
density statement concerning step mappings with respect to finite unions
of rectangles. We shall give an application of Corollary 6.4.

If $\varphi$ is a function on $\mathbf{R}^p$, we say that $\varphi$ has
compact support if $\varphi(x) = 0$ for $x$ outside some compact set. We
let $C_c^{\infty}(\mathbf{R}^p, \mathbf{C})$ be the space of $C^{\infty}$
(infinitely differentiable) functions (complex) with compact support. It
is clearly a vector space.


Theorem 9.3. Let $f \in \mathcal{L}^1(\mu)$. If

    $$ \int_{\mathbf{R}^p} f\varphi \, d\mu = 0 $$

for all $\varphi \in C_c^{\infty}(\mathbf{R}^p, \mathbf{C})$, then $f$ is
equal to 0 almost everywhere.


Proof. According to Corollary 6.4 it suffices to prove that

    $$ \int_A f \, d\mu = 0 $$

for all bounded rectangles $A$. We shall recall below how to approximate a
characteristic function $\chi_A$ of a rectangle by a $C^{\infty}$ function
with compact support, both almost everywhere and for the $L^1$-seminorm.
In other words, we can find a sequence $\{\varphi_n\}$ of $C^{\infty}$
functions with compact support which tends almost everywhere to $\chi_A$
and is bounded, say by a constant $C$. Then $\{\varphi_n f\}$ tends almost
everywhere to $f_A = \chi_A f$, and each $\varphi_n f$ is in
$\mathcal{L}^1$ by Corollary 5.11 of the dominated convergence theorem.
Applying the dominated convergence theorem, we conclude that
$\{\varphi_n f\}$ is $L^1$-convergent to $f_A$, whence

    $$ \int_{\mathbf{R}^p} \varphi_n f \, d\mu \quad\text{converges to}\quad \int_{\mathbf{R}^p} f_A \, d\mu. $$

This proves what we wanted.


We now recall the construction mentioned in our proof. It is basically a
one-dimensional construction. Let $a$, $b$ be real and $a < b$. The
function

    $$ h(t) = e^{-1/((t-a)(b-t))} \quad\text{if } a < t < b, $$
    $$ h(t) = 0 \quad\text{if } t \le a \text{ or } t \ge b, $$

is a bell-shaped $C^{\infty}$ function.

[Figure: graph of $h$, a bell-shaped bump vanishing outside $(a, b)$.]

The function

    $$ g\colon x \mapsto \int_{-\infty}^{x} h(t) \, dt = g(x) $$

then starts from 0 and climbs between $a$ and $b$ to a constant value.

[Figure: graph of $g$, rising from 0 at $a$ to a constant value at $b$.]

Multiplying by a positive constant, we can assume that the top value is
equal to any given number $> 0$.
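
The bump $h$ and the climbing function $g$ are easy to evaluate
numerically. The following sketch takes $a = 0$, $b = 1$ by default,
approximates the integral by a midpoint rule, and normalizes so that the
top value is 1; the step count and the normalization are our choices for
the illustration, not part of the text.

```python
import math


def h(t, a=0.0, b=1.0):
    """Bell-shaped function: exp(-1/((t-a)(b-t))) on (a, b), zero elsewhere."""
    if t <= a or t >= b:
        return 0.0
    return math.exp(-1.0 / ((t - a) * (b - t)))


def g(x, a=0.0, b=1.0, steps=2000):
    """Integral of h up to x (midpoint rule), divided by the full integral,
    so that g is 0 for x <= a and 1 for x >= b."""
    def integral(upper):
        upper = min(max(upper, a), b)  # h vanishes outside [a, b]
        width = (upper - a) / steps
        return sum(h(a + (i + 0.5) * width, a, b) for i in range(steps)) * width

    return integral(x) / integral(b)
```

Replacing `g(x)` by `g(c * x)` for a large constant `c` compresses the
climb into the interval $[a/c, b/c]$, which is the steepening step used in
the construction.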

If we make a translation on $g$ we can assume for instance that $a = 0$.
Considering the function $g(cx)$ instead of $g(x)$ where $c$ is a large
constant, we can make the climb arbitrarily steep. Combining translations
and such steep climbs, we can then find a function which is $C^{\infty}$,
equal to 0 outside $[a, b]$, and equal to a constant on most of $[a, b]$.

[Figure: graph of a smooth plateau function supported on $[a, b]$.]

In other words, this function approximates the characteristic function of
$[a, b]$ from below. We can do the same thing from above. Taking suitable
products to do the same thing in $p$-space, we end up with the following
result.


Lemma 9.4. Let $A$ be a bounded rectangle in $\mathbf{R}^p$. Given
$\varepsilon$, there exist $C^{\infty}$ functions $\varphi$, $\psi$ having
the following properties:

(i) We have

    $$ 0 \le \varphi \le \chi_A \le \psi \le 1. $$

(ii) We have

    $$ \int_{\mathbf{R}^p} (\psi - \varphi) \, d\mu < \varepsilon. $$

In fact, if

    $$ A = [a_1, b_1] \times \cdots \times [a_p, b_p], $$

the function $\psi$ is 0 outside the rectangle

    $$ [a_1 - \varepsilon, b_1 + \varepsilon] \times \cdots \times [a_p - \varepsilon, b_p + \varepsilon] $$

and the function $\varphi$ is 1 on the rectangle

    $$ [a_1 + \varepsilon, b_1 - \varepsilon] \times \cdots \times [a_p + \varepsilon, b_p - \varepsilon]. $$

Observe that in deriving this lemma, we are dealing with the simplest case
of Riemann integration. The lemma is at the level of elementary calculus.

The result of Theorem 9.3 really concerns the values of the map $f$ on
bounded sets of $\mathbf{R}^p$. In many applications, it is not convenient
to restrict oneself to elements of $\mathcal{L}^1$, and one needs a
formulation which allows us to deal with maps locally. Thus we say that a
map $f\colon \mathbf{R}^p \to E$ is locally integrable if for each compact
set $K$ in $\mathbf{R}^p$ the map $f_K$ (equal to $f$ on $K$ and 0 outside
$K$) is in $\mathcal{L}^1(\mu)$.


Corollary 9.5. Let $f$ be a locally integrable map on $\mathbf{R}^p$ such
that for all $\varphi \in C_c^{\infty}(\mathbf{R}^p, \mathbf{C})$ we have

    $$ \int_{\mathbf{R}^p} f\varphi \, d\mu = 0. $$

Then $f$ is equal to 0 almost everywhere.


Proof. This is really what Theorem 9.3 proved, since all we have to
consider is $f_A$ for every bounded rectangle $A$.


Theorem 9.6. The space $C_c^{\infty}(\mathbf{R}^p)$ is dense in
$\mathcal{L}^1(\mu, \mathbf{C})$.


Proof. We may restrict ourselves to the real functions. We know that the
step functions with respect to rectangles are dense in $\mathcal{L}^1$. On
the other hand, the characteristic function of a rectangle can be
approximated by $C^{\infty}$ functions with compact support, as we saw
above for the proof of Theorem 9.3. The assertion of our theorem follows
at once.


Let $f\colon \mathbf{R}^p \to E$ be a map, and let $a \in \mathbf{R}^p$.
We define the translation $\tau_a f$, also written $f_a$, to be the map
given by

    $$ (\tau_a f)(x) = f(x - a). $$

If $Y$ is a subset of $\mathbf{R}^p$, we define

    $$ Y_a = \tau_a(Y) = Y + a $$

to be the set of all points $x + a$ with $x \in Y$. Our definitions are
adjusted in such a way that

    $$ \tau_a(\chi_Y) = \chi_{Y_a}. $$


Theorem 9.7. The Lebesgue integral is translation invariant. This means:
If $f \in \mathcal{L}^1(\mu)$, then for each $a \in \mathbf{R}^p$ the map
$\tau_a f$ is in $\mathcal{L}^1(\mu)$, and we have

    $$ \int_{\mathbf{R}^p} \tau_a f \, d\mu = \int_{\mathbf{R}^p} f \, d\mu. $$

Proof. By Theorem 6.3 we can find a sequence $\{\varphi_n\}$ of step maps
with respect to finite unions of rectangles which converges both $L^1$ and
almost everywhere to $f$. If $\varphi$ is a step map as above, it is clear
that its integral is the same as the integral of a translation
$\tau_a \varphi$, because if $R$ is a rectangle, then

    $$ \mu(R) = \mu(\tau_a R). $$

But $\{\tau_a \varphi_n\}$ converges almost everywhere to $\tau_a f$, and
by the preceding remark, $\{\tau_a \varphi_n\}$ is $L^1$-Cauchy, whence is
also $L^1$-convergent to $\tau_a f$. Our theorem follows at once.


Theorem 9.8. If $Y$ is a measurable set in $\mathbf{R}^p$, then we have:

    $$ \mu(Y) = \inf \mu(U) \quad\text{for } U \text{ open, } U \supset Y, $$
    $$ \mu(Y) = \sup \mu(K) \quad\text{for } K \text{ compact, } K \subset Y. $$

Thus if $Y$ has finite measure, given $\varepsilon$ there exists an open
set $U$ and a compact set $K$ such that

    $$ K \subset Y \subset U \quad\text{and}\quad \mu(U - K) < \varepsilon. $$


Proof. The statement concerning open sets is clear by applying the
definition of our measure as an application of the Hahn theorem, giving
$\mu$ as the outer measure with respect to bounded rectangles. We can
always take the rectangles to be open to cover $Y$, since a closed
rectangle is contained in an open one whose measure is at most
$\varepsilon/2^n$ bigger. Concerning the statement about compact sets,
suppose first that $Y$ is bounded, say contained in a closed bounded
rectangle $R$. We find an open set $U$ containing $R - Y$ such that

    $$ \mu(U) < \mu(R - Y) + \varepsilon. $$

Let $K = R \cap \mathscr{C}U = R - U$. Then $K$ is compact and contained
in $Y$. We have trivially:

    $$ \mu(K) \le \mu(Y) = \mu(R) - \mu(R - Y) $$
    $$ \le \mu(R) - \mu(U) + \varepsilon $$
    $$ \le \mu(K) + \varepsilon. $$

This proves our assertion when $Y$ is bounded. The general case follows at
once by considering the intersections of $Y$ with a sequence $\{R_n\}$ of
rectangles such that $R_n \subset R_{n+1}$ for all $n$, and such that the
union of the $R_n$ is the entire euclidean space.


VI, §10. EXERCISES


Unless otherwise specified, $(X, \mathcal{M}, \mu)$ is a measured space.


1. (a) Let $\mathcal{M}$ be a $\sigma$-algebra in a set $X$, and let

    $$ f\colon X \to Y \quad\text{and}\quad g\colon Y \to Z $$

be mappings. Show that

    $$ (g \circ f)_*(\mathcal{M}) = g_*(f_*(\mathcal{M})). $$

In other words, $(g \circ f)_* = g_* \circ f_*$.

(b) Let $\mathcal{M}$ be a $\sigma$-algebra in a set $X$, and let $\mu$ be
a positive measure. Let $f\colon X \to Y$ be a mapping. Define the direct
image $f_*\mu$ on $f_*\mathcal{M}$ by the condition

    $$ (f_*\mu)(B) = \mu(f^{-1}(B)) $$

for all $B$ in $f_*\mathcal{M}$. Show that $f_*\mu$ is a positive measure.


2. Egoroff's theorem. Assume that $\mu$ is $\sigma$-finite. Let
$f\colon X \to E$ be a map and assume that $f$ is the pointwise limit of a
sequence of simple maps $\{\varphi_n\}$. Given $\varepsilon$, show that
there exists a set $Z$ with $\mu(Z) < \varepsilon$ such that the
convergence of $\{\varphi_n\}$ is uniform on the complement of $Z$. [Hint:
Assume first that $\mu(X)$ is finite. Let $A_k$ be the set where
$|f| \ge k$. The intersection of all $A_k$ is empty so their measures tend
to 0. Excluding a set of small measure, you can assume that $f$ is
bounded, in which case $f$ is in $\mathcal{L}^1(\mu)$ and you can use the
fundamental lemma of integration, or Theorem 5.2.]


3. Let $\{f_n\}$ be a sequence of measurable functions. Show that the set
of those $x$ such that $\{f_n(x)\}$ converges is a measurable set.


4. Let $\{a_n\}$ be a sequence in $[-\infty, \infty]$. View
$[-\infty, \infty]$ as a topological space, neighborhoods of $-\infty$
being given by sets $[-\infty, a)$ for $a$ real, and similarly for
neighborhoods of $\infty$. By

    $$ \limsup a_n $$

we mean the least upper bound of all points of accumulation of the
sequence $\{a_n\}$. We allow $-\infty$ and $+\infty$ as points of
accumulation, taking the obvious ordering in $[-\infty, \infty]$ where

    $$ -\infty < a < \infty $$

for all real $a$.

(a) Let $b = \limsup a_n$. Suppose that $b$ is a number. Show that given
$\varepsilon$, there exists only a finite number of $n$ such that
$a_n > b + \varepsilon$, and there exist infinitely many $n$ such that
$a_n > b - \varepsilon$. Prove that this property characterizes the
limsup. Give a similar characterization when $b = \infty$.

(b) Characterize lim inf similarly, and show that

    $$ \liminf a_n = -\limsup(-a_n). $$

(c) A sequence $\{a_n\}$ in $[-\infty, \infty]$ converges if and only if

    $$ \limsup a_n = \liminf a_n. $$

(d) If $\{f_n\}$ is a sequence of measurable maps of $X$ into
$[-\infty, \infty]$, then its upper limit and lower limit are measurable.
(By the way, the limsup and lim inf of the sequence $\{f_n\}$ are defined
pointwise.)


5. Positive measurable maps. A map $f: X \to [0, \infty]$ will be called positive.

(a) If $f, g: X \to [0, \infty]$ are measurable, show that $f + g$, $fg$ are
measurable. If $\{f_n\}$ is a sequence of positive measurable maps, show that
$\sup f_n$ and $\inf f_n$ are also measurable.

(b) If $\mu$ is $\sigma$-finite, show that $f$ is measurable if and only if $f$ is the
limit of an increasing sequence of real valued step functions (0 outside a
set of finite measure).

(c) For a positive measurable map $f: X \to [0, \infty]$ let $\{f_n\}$ be a sequence
of positive simple functions (real valued) which is increasing to $f$. If the
integrals

    $$\int_X f_n \, d\mu$$

exist and are bounded (so in particular each $f_n$ is 0 outside a set of finite
measure), define the integral of $f$ to be their least upper bound, and if
unbounded, define the integral of $f$ to be $\infty$. Show that this is well
defined, i.e. independent of the sequence $\{f_n\}$ increasing to $f$. Formulate
and prove the monotone convergence theorem in this context. Note: Instead
of redoing integration theory, you can quote results from the text to
shorten the procedure.

(d) For each measurable $A$ and positive measurable map $f: X \to [0, \infty]$
define

    $$\mu_f(A) = \int_A f \, d\mu.$$

Show that $\mu_f$ is a positive measure on $X$. If $g: X \to [0, \infty]$ is measurable,
show that

    $$\int_X g \, d\mu_f = \int_X fg \, d\mu.$$


6. Let $\{f_n\}$ be a sequence of continuous functions on $[0, 1]$ such that
$0 \leq f_n \leq 1$ and such that $\{f_n(x)\}$ converges to 0 for every $x$ in $[0, 1]$.
Show that

    $$\lim \int_0^1 f_n \, d\mu = 0,$$

where $\mu$ is Lebesgue measure.


7. Completion of a measure.

(a) Let $\bar{\mathcal{M}}$ consist of all subsets $Y$ of $X$ which differ from an element
of $\mathcal{M}$ by a set contained in a set of measure 0. In other words, there
exists a set $A$ in $\mathcal{M}$ such that $(Y - A) \cup (A - Y)$ is contained in a set of
measure 0. Show that $\bar{\mathcal{M}}$ is a $\sigma$-algebra. If we define $\bar{\mu}(Y) = \mu(A)$
for $Y$, $A$ as above, show that this is well defined on $\bar{\mathcal{M}}$, and that $\bar{\mu}$
is a measure on $\bar{\mathcal{M}}$. We call $(X, \bar{\mathcal{M}}, \bar{\mu})$ the complete measure space
determined by $(X, \mathcal{M}, \mu)$, and we call $\bar{\mu}$ the completion of $\mu$.

(b) Let $(X_i, \mathcal{M}_i, \mu_i)$ ($i = 1, 2, 3$) be measured spaces. Show that

    $$(\mathcal{M}_1 \otimes \mathcal{M}_2) \otimes \mathcal{M}_3 = \mathcal{M}_1 \otimes (\mathcal{M}_2 \otimes \mathcal{M}_3),$$
    $$(\mu_1 \otimes \mu_2) \otimes \mu_3 = \mu_1 \otimes (\mu_2 \otimes \mu_3).$$

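If $(X, \mathcal{M}, \mu)$ and $(Y, \mathcal{N}, \nu)$ are measured spaces, show that

    $$\overline{\mathcal{M} \otimes \mathcal{N}} = \overline{\bar{\mathcal{M}} \otimes \bar{\mathcal{N}}} \quad\text{and}\quad \overline{\mu \otimes \nu} = \overline{\bar{\mu} \otimes \bar{\nu}}.$$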


8. (a) Direct image of a measure. Let $(X, \mathcal{M})$ and $(Y, \mathcal{N})$ be measurable
spaces. Let $f: X \to Y$ be a map such that for each $B \in \mathcal{N}$ we have
$f^{-1}(B) \in \mathcal{M}$. Let $\mu$ be a positive measure on $\mathcal{M}$, and let $\mu_* = f_*\mu$ be
defined on $\mathcal{N}$ by $\mu_*(B) = \mu(f^{-1}(B))$. Show that $\mu_*$ is a measure. Show
that if $g$ is in $\mathcal{L}^1(\mu_*)$, then $g \circ f$ is in $\mathcal{L}^1(\mu)$, and that

    $$\int_X g \circ f \, d\mu = \int_Y g \, d\mu_*.$$

(b) Let $X$, $Y$ be topological spaces and $f: X \to Y$ a homeomorphism. Show
that $f$ induces a bijective map

    $$f^*: \mathcal{B}(Y) \to \mathcal{B}(X)$$

where $\mathcal{B}$ denotes the Borel algebra.


9. Let $E$ be a Hilbert space with countable base. A map $f: X \to E$ is called
weakly measurable if for every functional $\lambda$ on $E$ the composite $\lambda \circ f$ is
measurable. Let $f, g: X \to E$ be weakly measurable. Show that the map

    $$x \mapsto \langle f(x), g(x) \rangle$$

is measurable. [Hint: Write the maps in terms of their component functions
with respect to a Hilbert basis, so the scalar product becomes a limit of
measurable functions.]


10. Monotone families.

(a) A collection $\mathcal{N}$ of subsets of $X$ is said to be monotone if, whenever
$\{A_n\}$ is an increasing (resp. decreasing) sequence of subsets in $\mathcal{N}$, then

    $$\bigcup A_n \quad (\text{resp. } \bigcap A_n)$$

also lies in $\mathcal{N}$. Let $\mathcal{A}$ be an algebra of subsets of $X$. Show that there
exists a smallest monotone collection of subsets of $X$ containing $\mathcal{A}$.
Denote it by $\mathcal{N}$. If $X \in \mathcal{A}$, show that $\mathcal{N}$ is a $\sigma$-algebra, and is thus
the smallest $\sigma$-algebra containing $\mathcal{A}$. [Hint: For each $A \in \mathcal{N}$, let
$\mathcal{N}(A)$ be the collection of all sets $B \in \mathcal{N}$ such that $B \cup A$, $B - A$ and
$A - B$ lie in $\mathcal{N}$. Then $\mathcal{N}(A)$ is monotone.]

(b) Assume that $X \in \mathcal{A}$. Let $\mu$ be a positive measure on $\mathcal{A}$. Show that an
extension of $\mu$ to a positive measure on $\mathcal{N}$ is uniquely determined, by
proving: If $\mu_1$, $\mu_2$ are extensions of $\mu$ to $\mathcal{N}$, then the collection of
subsets $Y$ such that $\mu_1(Y) = \mu_2(Y)$ is monotone.


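11. Let $(X, \mathcal{M}, \mu)$ and $(Y, \mathcal{N}, \nu)$ be measured spaces. If $Q \in \mathcal{M} \otimes \mathcal{N}$,
show that the map

    $$x \mapsto \nu(Q_x)$$

is measurable (with respect to $\mathcal{M}$). [Hint: Show that the set of $Q$ in
$\mathcal{M} \otimes \mathcal{N}$ having the above property is a monotone family containing the
rectangles.]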


12. Show that if $c_n$ is the (Lebesgue) measure of the closed $n$-ball in $\mathbf{R}^n$
of radius 1, centered at the origin, then

    $$c_n = c_{n-1} \int_{-\pi/2}^{\pi/2} \cos^n t \, dt,$$

and therefore

    $$c_n = \frac{\pi^{n/2}}{\Gamma\left(\frac{n}{2} + 1\right)}.$$
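
The two formulas of this exercise can be compared numerically before attempting a proof. The following is only a sanity check under stated assumptions, not a solution: the starting value `c_0 = 1` and the midpoint rule (with the hypothetical helper names below) are choices made for this sketch and are not part of the exercise.

```python
import math

def cos_power_integral(n: int, steps: int = 20_000) -> float:
    """Midpoint-rule approximation of the integral of cos(t)**n over [-pi/2, pi/2]."""
    h = math.pi / steps
    return h * sum(
        math.cos(-math.pi / 2 + (i + 0.5) * h) ** n for i in range(steps)
    )

def ball_volume_recursive(n: int) -> float:
    """c_n computed from c_n = c_{n-1} * integral of cos^n, assuming c_0 = 1."""
    c = 1.0
    for k in range(1, n + 1):
        c *= cos_power_integral(k)
    return c

def ball_volume_closed(n: int) -> float:
    """c_n computed from the closed form pi^(n/2) / Gamma(n/2 + 1)."""
    return math.pi ** (n / 2) / math.gamma(n / 2 + 1)

for n in range(1, 6):
    print(n, ball_volume_recursive(n), ball_volume_closed(n))
```

For small `n` the two columns should agree up to the quadrature error of the midpoint rule.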


13. Let $T$ be a metric space and let $f$ be a map on $X \times T$ such that for each
$t \in T$ the partial map

    $$f_t: x \mapsto f(x, t)$$

is in $\mathcal{L}^1$. Assume that for each $x$ the map $t \mapsto f(x, t)$ is continuous.
Finally assume that there is some $g \in \mathcal{L}^1(\mu, \mathbf{R})$ such that
$|f(x, t)| \leq |g(x)|$ for all $x$. Show that the function $\Phi$ given by

    $$\Phi(t) = \int_X f(x, t) \, d\mu(x)$$

is continuous.


14. Differentiating under the integral sign. Let $T$ be open in some euclidean
space. Let $f$ be a map on $X \times T$ satisfying:

(a) For each $t$ the map $x \mapsto f(x, t)$ is in $\mathcal{L}^1$.

(b) For each $x$, the map $f_x: t \mapsto f(x, t)$ is differentiable, and its
derivative is continuous in $t$.

(c) The second partial $D_2 f(x, t)$ is in $\mathcal{L}^1$ for each $t$, and there exists an
element $g \in \mathcal{L}^1(\mu, \mathbf{R})$ and $g \geq 0$ such that

    $$|D_2 f(x, t)| \leq g(x)$$

for all $x$, $t$.

Then the map $\Phi$ as in the preceding exercise is differentiable, and its
derivative is given by

    $$D\Phi(t) = \int_X D_2 f(x, t) \, d\mu(x).$$

(If you prefer, take $T$ to be an open interval.)

15. Let $\mu$ be Lebesgue measure on $\mathbf{R}^p$. If $f, g \in \mathcal{L}^1(\mu)$, define $f * g$ by

    $$f * g(x) = \int_{\mathbf{R}^p} f(t) g(x - t) \, d\mu(t).$$

(a) Show that $f * g \in \mathcal{L}^1(\mu)$ and that $\|f * g\|_1 \leq \|f\|_1 \|g\|_1$. We call
$f * g$ the convolution of $f$ and $g$.

(b) Show that convolution is commutative, associative, bilinear, and that
$L^1(\mu)$ is therefore a Banach algebra. Does there exist a unit element in
this algebra?
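
A discrete analogue may help build intuition for part (a). The sketch below replaces Lebesgue measure on euclidean space by counting measure on the integers and uses finitely supported sequences; this substitution, and the helper names, are assumptions made purely for illustration and do not prove anything about the exercise itself.

```python
def convolve(f, g):
    """Convolution of two finitely supported sequences indexed from 0."""
    out = [0.0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            out[i + j] += a * b  # contribution of f(i) * g(j) to index i + j
    return out

def l1_norm(f):
    """L^1 norm with respect to counting measure."""
    return sum(abs(a) for a in f)

f = [1.0, -2.0, 0.5]
g = [0.25, 3.0]

fg = convolve(f, g)
gf = convolve(g, f)

print(fg)                              # the convolution f * g
print(l1_norm(fg), l1_norm(f) * l1_norm(g))
```

In this toy setting one can observe both the norm inequality and commutativity directly; the sign changes in `f` are what make the inequality strict here.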


16. Let $M$ be the set of all finite positive Borel measures on $\mathbf{R}^p$. For each
$\mu \in M$ define $\|\mu\| = \mu(\mathbf{R}^p)$. For $\mu, \nu \in M$, and any Borel subset $A$ of
$\mathbf{R}^p$ define

    $$(\mu * \nu)(A) = (\mu \otimes \nu)(\sigma^{-1}(A))$$

where $\sigma: \mathbf{R}^p \times \mathbf{R}^p \to \mathbf{R}^p$ is the sum, that is $\sigma(x, y) = x + y$.

(a) Show that $\sigma^{-1}(A)$ is a Borel set in $\mathbf{R}^p \times \mathbf{R}^p$.

(b) Show that $\mu * \nu \in M$ and that $\|\mu * \nu\| \leq \|\mu\| \, \|\nu\|$.

(c) Show that $\mu * \nu$ is the unique positive Borel measure $\tau$ such that

    $$\int f \, d\tau = \int\!\!\int f(x + y) \, d\nu(y) \, d\mu(x)$$

for every step function $f$ with respect to rectangles.

(d) The operation $(\mu, \nu) \mapsto \mu * \nu$ is called convolution. Show that it is
commutative, associative, and bilinear.

(e) Show that there exists a unit element in $M$, i.e. an element $\delta$ such that
$\delta * \mu = \mu * \delta = \mu$ for all $\mu \in M$.

(f) Let $\mu$ be Lebesgue measure, and let $f, g \in \mathcal{L}^1(\mu)$. Show that

    $$\mu_f * \mu_g = \mu_{f * g}.$$

(g) After you have read about complex measures in the next chapter, show
that all the previous properties apply as well to such measures, and that
these measures therefore form a Banach algebra under convolution.


17. Let $X = [-\pi, \pi]$, and let $\mu$ be Lebesgue measure. Let $f \in \mathcal{L}^1(\mu, \mathbf{C})$.
Show that one can define the Fourier coefficients of $f$ in the usual way, by

    $$c_n = \frac{1}{2\pi} \int_{-\pi}^{\pi} f(x) e^{-inx} \, dx.$$

If $c_n = 0$ for all integers $n$, show that $f$ is equal to 0 almost everywhere.


18. Riemann-Lebesgue lemma. Let $f \in \mathcal{L}^1(\mathbf{R})$. Prove that

    $$\lim_{t \to \infty} \int_{\mathbf{R}} f(x) e^{itx} \, dx = 0.$$

[Hint: Approximate $f$ by a $C^\infty$ function with compact support, in which case
integrate by parts.]


19. (a) Let $\mathbf{R}^*$ be the multiplicative group of non-zero real numbers. Show
that the map

    $$\varphi \mapsto \int_{\mathbf{R}^*} \varphi(t) \, \frac{dt}{|t|}$$

for $\varphi$ a step function with respect to intervals not containing 0, defines a
positive Borel measure on $\mathbf{R}^*$. We denote this measure by $\mu^*$. Show that
a function $f$ is in $\mathcal{L}^1(\mu^*)$ if and only if $f(x)/|x|$ is in $\mathcal{L}^1(\mu)$, where $\mu$ is
Lebesgue measure, and that in this case,

    $$\int_{\mathbf{R}^*} f \, d\mu^* = \int_{\mathbf{R} - \{0\}} f(x) |x|^{-1} \, dx.$$

(b) Show that $\mu^*$ is invariant under multiplicative translations, and so is
the integral on $\mathbf{R}^*$ with respect to $\mu^*$. (Multiplicative translations are of
type $x \mapsto ax$ for $a \neq 0$.)


20. Not all sets are measurable. Consider the reals modulo the rational
numbers, and in each coset $x + \mathbf{Q}$, $x$ real, select an element $y$ such that
$0 \leq y \leq 1$. Show that the set consisting of all such elements cannot be
Lebesgue measurable. [Hint: Use the countable additivity to show that this
set cannot be measurable.]


21. Let X be a measured space with finite measure p(X). Let fe #*(y). Com- 
pute the limit 


    $$\lim_{n \to \infty} \int_X f(x)^n \, d\mu(x).$$


22. Arbitrary products. Let $(X_n, \mathcal{M}_n, \mu_n)$ be a family of measured spaces
such that $\mu_n(X_n) = 1$ for almost all $n$ (meaning all but a finite number of
$n$). Let

    $$X = \prod X_n$$

be the product space. Let $\mathcal{M}$ be the $\sigma$-algebra generated by all sets of the
form

    $$A = \prod A_n,$$

where $A_n$ is measurable in $X_n$, and $A_n = X_n$ for almost all $n$. Then $(X, \mathcal{M})$
is a measurable space. A set $A$ as above is called decomposable.

(a) Show that there exists a unique measure $\mu$ on $(X, \mathcal{M})$ such that for
every decomposable set as above, we have

    $$\mu(A) = \prod \mu_n(A_n).$$

(b) Let $f_n \in \mathcal{L}^1(\mu_n)$ and assume that $f_n$ is the characteristic function of
$X_n$ for almost all $n$. Show that the product function $f = \otimes f_n$ is in
$\mathcal{L}^1(\mu)$, and that

    $$\int_X f \, d\mu = \prod_n \int_{X_n} f_n \, d\mu_n.$$

Note: Do this exercise first for finite products.


23. Dini's theorem. The dominated convergence theorem is useful in almost all
instances, but sometimes one wants a more delicate criterion for the
interchange of a limit and an integral. This may be provided by Dini's
theorem, as follows.

Dini's Theorem. Let $\{F_n\}$ be a sequence of functions on $[a, \infty)$ such that
for all $n$ we have $F_n \in \mathcal{L}^1([a, B])$ for all $B \geq a$. Let

    $$I_n(x) = \int_a^x F_n.$$

Assume:

(a) The sequence $\{F_n\}$ converges pointwise to a function $F$, uniformly on
each finite interval $[a, B]$.

(b) The sequence $\{I_n\}$ converges uniformly on $[a, \infty)$.

(c) The improper integrals

    $$\int_a^\infty F_n = \lim_{B \to \infty} \int_a^B F_n \quad\text{and}\quad \int_a^\infty F = \lim_{B \to \infty} \int_a^B F$$

exist.

Then

    $$\lim_{n \to \infty} \int_a^\infty F_n = \int_a^\infty F.$$

Proof. Given $\epsilon$, we consider

    $$(1) \qquad \int_a^\infty (F_n - F) = \int_a^B (F_n - F_m) + \int_a^B (F_m - F) + \int_B^\infty (F_n - F).$$

By assumption (b) there exists $n_0$ such that for all $m, n \geq n_0$ we have

    $$\left| \int_a^B (F_n - F_m) \right| < \epsilon \quad\text{for all } B.$$

Let $n \geq n_0$. By assumption (c), there is $B(n)$ and $B(\infty)$ such that

    $$\left| \int_B^\infty F_n \right| < \epsilon \ \text{ for } B \geq B(n) \quad\text{and}\quad \left| \int_B^\infty F \right| < \epsilon \ \text{ for } B \geq B(\infty).$$

We now pick $B = \max(B(n), B(\infty))$. Then the third integral on the right of
(1) is bounded by $2\epsilon$. Using (a) we select $m$ sufficiently large so that the
second integral in (1) is bounded by $\epsilon$. This concludes the proof.
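24. Carathéodory's criterion. Assume that $\mathcal{M}$ is the $\sigma$-algebra of all
$\mu$-measurable subsets of $X$ in the sense of §7. Let $f: X \to Y$ be a map of $X$
into a metric space $Y$. Prove that $f$ is measurable if and only if for every
subset $Z$ of $X$ and every two subsets $B$, $C$ of $Y$ satisfying

    $$\mathrm{dist}(B, C) > 0,$$

we have

    $$(*) \qquad \mu(Z) \geq \mu(Z \cap f^{-1}(B)) + \mu(Z \cap f^{-1}(C)).$$

[Hint: One direction is obvious. Conversely, assume (*). Let $A$ be closed in
$Y$. Let:

    $$B_m = \left\{ y \in Y \;\middle|\; \frac{1}{m+1} < \mathrm{dist}(y, A) \leq \frac{1}{m} \right\},$$
    $$C_m = \left\{ y \in Y \;\middle|\; \mathrm{dist}(y, A) > \frac{1}{m} \right\},$$
    $$B'_m = \left\{ y \in Y \;\middle|\; 0 < \mathrm{dist}(y, A) \leq \frac{1}{m} \right\}.$$

Then $C_m \cup B'_m$ is the complement of $A$. Prove that for any subset $Z$ of $X$
we have

    $$\mu(Z \cap f^{-1}(A)) + \mu(Z - f^{-1}(A)) \leq \mu(Z) + \mu(Z \cap f^{-1}(B'_m)).$$

Then show that

    $$\lim_{m \to \infty} \mu(Z \cap f^{-1} B'_m) = 0$$

by considering the sums $\sum \mu(Z \cap f^{-1} B_k)$ for $k$ even and $k$ odd, and
applying the hypothesis (*).]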


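25. Let $X$ be a metric space and $\mathcal{F}$ a family of subsets of $X$ whose union
covers $X$. Let

    $$\varphi: \mathcal{F} \to \mathbf{R} \cup \{\infty\}$$

be a non-negative function. For every $c > 0$ and $A \subset X$, let

    $$\mu_c(A) = \inf_{\mathcal{G}} \sum_{F \in \mathcal{G}} \varphi(F),$$

where $\mathcal{G}$ is a subfamily of $\mathcal{F}$ such that:
(i) The union of the elements of $\mathcal{G}$ covers $A$.
(ii) If $F \in \mathcal{G}$, then $\mathrm{diam}\, F < c$.

(a) Prove that $\mu_c$ is an outer measure on $X$. Define $\mu(A) = \lim_{c \to 0} \mu_c(A)$.
(b) Prove that $\mu$ is an outer measure, called the Carathéodory measure
associated with $(\varphi, \mathcal{F})$.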


26. Let $\mu$ be the Carathéodory measure associated with $(\varphi, \mathcal{F})$.
(a) Prove that the open sets of $X$ are $\mu$-measurable. (Use Carathéodory's
criterion applied to the identity map.)
(b) If all elements of $\mathcal{F}$ are Borel sets, prove that for any subset $A$ of $X$
we have

    $$\mu(A) = \inf \mu(B),$$

the inf being taken for all Borel sets $B$ containing $A$.


Examples. For $F \subset \mathbf{R}^n$ let

    $$\varphi(F) = v_m 2^{-m} (\mathrm{diam}\, F)^m,$$

where $v_m$ is the volume of the $m$-dimensional ball in $\mathbf{R}^m$, and let $\mathcal{F}$ be the
family of all open sets of $\mathbf{R}^n$. The associated Carathéodory measure is called
the $m$-dimensional Hausdorff measure $\mathcal{H}^m$.


CHAPTER VII 


Duality and Representation 
Theorems 


Throughout this chapter $(X, \mathcal{M}, \mu)$ is a measured space.


VII, §1. THE HILBERT SPACE $L^2(\mu)$


Consider first complex valued functions. We let $\mathcal{L}^2(\mu)$ be the set of all
functions $f$ on $X$ that are limits almost everywhere of a sequence of step
functions (i.e. $\mu$-measurable), and such that $|f|^2$ lies in $\mathcal{L}^1$. Thus

    $$|f|^2 = f \bar{f}.$$

If we wish to consider a Hilbert space $E$ instead of $\mathbf{C}$, we let $\mathcal{L}^2(\mu, E)$
be the set of all maps $f: X \to E$ that are limits almost everywhere of a
sequence of step maps, and such that $|f|^2$ lies in $\mathcal{L}^1$. There is no change
from the preceding definition. In this case,

    $$|f|^2 = \langle f, f \rangle,$$

the value of $\langle f, g \rangle$ at $x$ being given by the scalar product
$\langle f(x), g(x) \rangle$ in $E$.

The reader interested only in the complex numbers can take $E = \mathbf{C}$
and the product to be $\langle f, g \rangle = f \bar{g}$, where the bar denotes complex
conjugation. Not a single proof, however, will be made shorter or simpler.


Theorem 1.1. The set $\mathcal{L}^2(\mu)$ is a vector space. If $f, g \in \mathcal{L}^2(\mu)$, then
$\langle f, g \rangle$ is in $\mathcal{L}^1(\mu)$, and the map

    $$(f, g) \mapsto \int_X \langle f, g \rangle \, d\mu$$

is a positive hermitian product on $\mathcal{L}^2(\mu)$ (not necessarily positive
definite).


Proof. The map $\langle f, g \rangle$ is obviously a limit almost everywhere of step
maps, and we have

    $$2 |\langle f, g \rangle| \leq |f|^2 + |g|^2.$$

Thus the absolute value is bounded by a function in $\mathcal{L}^1$, whence by
Corollary 5.9 of the dominated convergence theorem, Chapter VI, it
follows that $\langle f, g \rangle$ is in $\mathcal{L}^1$. As for the fact that $\mathcal{L}^2$ is a vector
space, let $f, g \in \mathcal{L}^2$. We have

    $$|f + g|^2 \leq |f|^2 + 2 |\langle f, g \rangle| + |g|^2,$$

whence the same reference shows that $f + g \in \mathcal{L}^2$. It is clear that if $a$ is
a number, then $af$ is in $\mathcal{L}^2$, so $\mathcal{L}^2$ is a vector space. The last assertion
is now obvious.


We denote our hermitian product by

    $$\langle f, g \rangle_\mu = \int_X \langle f, g \rangle \, d\mu.$$

We have the usual properties, like the Schwarz inequality. The $L^2$-seminorm
is defined by

    $$\|f\|_2 = \langle f, f \rangle_\mu^{1/2}.$$


Corollary 1.2. We have $\|f\|_2 = 0$ if and only if $f$ is equal to 0 almost
everywhere.


Proof. This is really a statement about $|f|^2$, which is in $\mathcal{L}^1$, and we
know this result already.


Corollary 1.3. If $X$ has finite measure and $f \in \mathcal{L}^2(\mu)$, then actually $f$ is
in $\mathcal{L}^1(\mu)$ and $\|f\|_1 \leq \|f\|_2 \|1_X\|_2$.


Proof. We apply the theorem and the Schwarz inequality to the pair
$|f|$ and $1_X$ (the constant 1 on $X$).
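
The inequality of Corollary 1.3 can be illustrated on a finite measured space, where integrals are weighted sums. The sketch below is only an illustration: the four-point space, its weights, and the sample values of `f` are assumptions chosen for the example.

```python
import math

def l1_norm(f, weights):
    """Integral of |f| against a measure given by point weights."""
    return sum(abs(v) * w for v, w in zip(f, weights))

def l2_norm(f, weights):
    """Square root of the integral of |f|^2 against the same measure."""
    return math.sqrt(sum(abs(v) ** 2 * w for v, w in zip(f, weights)))

# A four-point space X with total measure mu(X) = 4.25 (illustrative values).
weights = [0.5, 1.5, 2.0, 0.25]
f = [3 - 4j, -1.0, 0.5j, 2.0]

lhs = l1_norm(f, weights)
# The L^2 seminorm of the constant function 1 on X is sqrt(mu(X)).
rhs = l2_norm(f, weights) * math.sqrt(sum(weights))

print(lhs, rhs)
```

Here `rhs` plays the role of the product of the two L^2 seminorms in the corollary, and `lhs` never exceeds it.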


We can form the space $L^2(\mu)$ of equivalence classes of maps in $\mathcal{L}^2$,
differing only on a set of measure 0. We see that the hermitian product is
positive definite on $L^2(\mu)$.


Theorem 1.4. Let $\{f_n\}$ be an $L^2$-Cauchy sequence in $\mathcal{L}^2$. Then there
exists some $f$ in $\mathcal{L}^2$ having the following properties:

(i) The sequence $\{f_n\}$ is $L^2$-convergent to $f$, so that $\mathcal{L}^2$ is complete,
and $L^2(\mu)$ is a Hilbert space.

There exists a subsequence of $\{f_n\}$ having the following properties:

(ii) This subsequence converges almost everywhere to $f$.
(iii) Given $\epsilon$, there exists a set $Z$ with $\mu(Z) < \epsilon$ such that the
convergence of this subsequence is uniform on the complement of $Z$.


Proof. As before, we really prove these statements in reverse order.
We may assume all $f_n$ measurable. Taking a subsequence if necessary we
may assume that for $m \geq n$ we have

    $$\|f_m - f_n\|_2^2 < \frac{1}{2^{2n}}.$$

We let $Y_n$ be the set of $x \in X$ such that

    $$|f_{n+1}(x) - f_n(x)|^2 \geq \frac{1}{2^n}.$$

Then $Y_n$ has finite measure, and the proof of Lemma 3.1 in the preceding
chapter goes through as before. Indeed,
$2^{-n} \mu(Y_n) \leq \|f_{n+1} - f_n\|_2^2 < 2^{-2n}$, so we have $\mu(Y_n) < 1/2^n$, and we let

    $$Z_n = Y_n \cup Y_{n+1} \cup \cdots.$$

If $x \notin Z_n$, then for $k \geq n$ we have

    $$|f_{k+1}(x) - f_k(x)|^2 < \frac{1}{2^k}$$

so that the series

    $$f_1 + \sum_{k=1}^{\infty} (f_{k+1} - f_k)$$

converges uniformly and absolutely on the complement of $Z_n$ for each $n$,
whence pointwise and absolutely on the complement of $Z$ (intersection of
all $Z_n$). This already proves (ii) and (iii).

Let $f(x)$ be the limit of $f_n(x)$ as $n \to \infty$ if $x \notin Z$, and let $f(x) = 0$
if $x \in Z$. There remains to prove that $f$ is in $\mathcal{L}^2$ and that $\{f_n\}$ is
$L^2$-convergent to $f$. The expression

    $$\int_X |f_n - f_m|^2 \, d\mu$$

is the $L^1$-seminorm of $|f_n - f_m|^2$. We fix $m$ and take the limit as $n \to \infty$.
We can apply Fatou's lemma, and conclude that $|f - f_m|^2$ is in $\mathcal{L}^1$,
whence $f - f_m$ is in $\mathcal{L}^2$. Since $\mathcal{L}^2$ is a vector space, and $f_m \in \mathcal{L}^2$, we
conclude that $f \in \mathcal{L}^2$. Fatou's lemma also shows that

    $$\int_X |f - f_m|^2 \, d\mu \leq \lim_{n \to \infty} \inf_{k \geq n} \int_X |f_k - f_m|^2 \, d\mu,$$

so that for large $m$ we see that $\|f - f_m\|_2$ is small, i.e. the sequence $\{f_m\}$
is $L^2$-convergent to $f$. This proves our theorem.


Corollary 1.5. If $\{f_n\}$ is an $L^2$-Cauchy sequence in $\mathcal{L}^2$ and if $\{f_n\}$
converges almost everywhere to a map $f$, then $f$ is in $\mathcal{L}^2$ and $\{f_n\}$ is
also $L^2$-convergent to $f$.


Proof. Obvious. 


Theorem 1.6 (Dominated Convergence Theorem for $L^2$). Let $\{f_n\}$ be
a sequence in $\mathcal{L}^2$ which converges pointwise almost everywhere to $f$.
Assume that there exists $g \in \mathcal{L}^2(\mu, \mathbf{R})$ such that $g \geq 0$ and such that
$|f_n| \leq g$. Then $f$ is in $\mathcal{L}^2$ and $\{f_n\}$ is $L^2$-convergent to $f$.


Proof. The proof is essentially the same as in the $\mathcal{L}^1$ case. For each
positive integer $k$ let

    $$g_k = \sup_{m, n \geq k} |f_n - f_m|.$$

Then $\{g_k\}$ is a decreasing sequence of real valued functions, and for
$m, n \geq k$ we have $|f_n - f_m| \leq 2g$. Therefore by Corollary 5.9 of the
monotone convergence theorem (Chapter VI) it follows that

    $$g_k^2 = \sup_{m, n \geq k} |f_n - f_m|^2$$

is in $\mathcal{L}^1$. By the monotone convergence theorem and the hypothesis, the
sequence $\{g_k\}$ converges almost everywhere to 0. Hence $\{f_n\}$ is actually
an $L^2$-Cauchy sequence, and we can apply Corollary 1.5 to conclude the
proof.


Corollary 1.7. The step maps are dense in $\mathcal{L}^2$.


Proof. Let $\{\varphi_n\}$ be a sequence of step maps converging pointwise to
an element $f$ of $\mathcal{L}^2$. Then $f$ is measurable. Define

    $$\psi_n(x) = \varphi_n(x) \quad\text{if } |\varphi_n(x)| \leq 2|f(x)|,$$
    $$\psi_n(x) = 0 \quad\text{if } |\varphi_n(x)| > 2|f(x)|.$$

Then $\psi_n$ is a step map for each $n$, and the sequence $\{\psi_n\}$ converges
pointwise to $f$. Furthermore, $|\psi_n| \leq 2|f|$ for all $n$. The theorem shows
that $\{\psi_n\}$ is $L^2$-convergent to $f$. Any element of $\mathcal{L}^2$ is equivalent to one
for which you can find a sequence $\{\varphi_n\}$ as above. Hence our corollary is
proved.


VII, §2. DUALITY BETWEEN $L^1(\mu)$ AND $L^\infty(\mu)$


As Corollary 5.11 of the dominated convergence theorem, in Chapter VI
we found that if $f \in \mathcal{L}^1$ and $g$ is a bounded $\mu$-measurable function, then
$fg$ is in $\mathcal{L}^1$. We now investigate this property more closely. Half of
what we do in this section will be valid in Hilbert space without changing
the proofs at all, but again the reader who wishes to understand everything
in terms of complex or real valued functions is welcome to do so
throughout.

We could put the sup norm on the space of step maps, but it is
convenient to adjust this norm in terms of the given measure $\mu$, and thus
define what is called the essential sup, as well as the completion of the
space of step maps under this seminorm. We define $\mathcal{L}^\infty(\mu)$ to be the
vector space of maps $f$ such that there exists a bounded $\mu$-measurable $g$
equal to $f$ almost everywhere. Properties relating to the integral with
respect to $\mu$ hold for equivalence classes of such maps. Therefore, if
$f \in \mathcal{L}^\infty(\mu)$ it is natural to define its essential sup to be

    $$\mathrm{ess\,sup}(f) = \|f\|_\infty = \inf_g \|g\|,$$

where $\| \ \|$ is the sup norm, and the inf is taken over all bounded
$\mu$-measurable maps $g$ equal to $f$ almost everywhere. Alternatively, for
each $c \geq 0$ let $S_c$ be the set of all $x$ such that $|f(x)| \geq c$. We could have
defined $\|f\|_\infty$ by the condition:

    $\|f\|_\infty$ = inf of the set of all numbers $c$ such that $\mu(S_c) = 0$.

The equivalence between the two conditions is immediately verified. (For
instance if $c > b$ and $\mu(S_c) > 0$, then $|f(x)| \geq c$ for all $x$ in a set of
measure $> 0$, so that for all $g$ equivalent to $f$ we must have $\|g\| \geq c$
also. This proves that $c \leq b$. The reverse inequality is equally clear.) We
also see at once that $\| \ \|_\infty$ is a seminorm on $\mathcal{L}^\infty(\mu)$. By definition, the
set of $x$ such that $|f(x)| > \|f\|_\infty$ has measure 0.

If $f \in \mathcal{L}^\infty(\mu)$, it is clear that we have $\|f\|_\infty = 0$ if and only if $f$ is
equal to 0 almost everywhere. Consequently, we can form the space $L^\infty(\mu)$
of equivalence classes of elements of $\mathcal{L}^\infty(\mu)$, and we shall see in a moment
that $L^\infty(\mu)$ is a Banach space.

Theorem 2.1.

(i) The space $\mathcal{L}^\infty(\mu)$ is complete. If $\{f_n\}$ is an $L^\infty$-Cauchy sequence in
$\mathcal{L}^\infty(\mu)$, then there exists a set $Z$ of measure 0 such that the
convergence of $\{f_n\}$ is uniform on the complement of $Z$.

(ii) If $E$ is finite dimensional, then the simple maps are dense in
$\mathcal{L}^\infty(\mu, E)$.

(iii) If $\mu(X)$ is finite, then given $\epsilon$ and $f \in \mathcal{L}^\infty(\mu)$, there exists a step
map $\varphi$ and a set $Z$ with $\mu(Z) < \epsilon$ such that

    $$|f - \varphi| < \epsilon \quad\text{on the complement of } Z.$$


Proof. To prove the first statement, let $\{f_n\}$ be an $L^\infty$-Cauchy sequence
in $\mathcal{L}^\infty(\mu)$. Let $Z$ be the set of all $x$ such that we have

    $$|f_n(x)| > \|f_n\|_\infty \quad\text{or}\quad |f_m(x) - f_n(x)| > \|f_m - f_n\|_\infty$$

for some $n$, or some pair $m$, $n$. Then $Z$ has measure 0, and the convergence
of the sequence is uniform on the complement of $Z$. We let $f$ have
value 0 in $Z$ and be the uniform limit of the sequence $\{f_n\}$ on the
complement of $Z$. Then $f \in \mathcal{L}^\infty(\mu)$, and clearly is the $L^\infty$ limit of $\{f_n\}$.

Now assume that $E$ is finite dimensional, or say equal to the complex
numbers for concreteness. Let $f \in \mathcal{L}^\infty(\mu, \mathbf{C})$. After replacing $f$ by an
equivalent function, we may assume that $f$ is measurable and bounded.
Say the values of $f$ are contained in a square. We cut up the square into
small $\epsilon$-squares which are disjoint, and take their inverse images in $X$.
These give a partition of $X$ and we can define a simple function with
respect to this partition by giving the function any one of its values in a
given square. To get our small squares, let, say, $e_1$, $e_2$ be the standard
unit vectors in $\mathbf{C} = \mathbf{R}^2$, and let $S$ be the square

    $$t e_1 + u e_2$$

with $0 \leq t < \epsilon$ and $0 \leq u < \epsilon$. The translates

    $$S + n\epsilon e_1 + m\epsilon e_2 = S_{m,n}$$

with integers $m$, $n$, are disjoint. If $N$ is large and we take

    $$-N < n < N \quad\text{and}\quad -N < m < N,$$

then our small squares $S_{m,n}$ cover the image of $f$ as desired. The argument
also works in any finite dimensional space, taking unit vectors
$e_1, \ldots, e_k$ in $\mathbf{R}^k$.

Finally, for the third part of the theorem, if $\mu(X)$ is finite, then any
element of $\mathcal{L}^\infty(\mu)$ is in $\mathcal{L}^1(\mu)$, and our assertion follows from the fact
that elements of $\mathcal{L}^1$ are $L^1$-limits of step maps, together with the
fundamental lemma of integration, or Theorem 5.2 of Chapter VI.
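
The grid construction in the proof of (ii) can be sketched concretely. The code below is a minimal illustration under stated assumptions: instead of picking one of the values of the function in each small square, as the proof does, it uses the lower-left corner of the square, which is a slightly different but equally simple choice; the function name `snap_to_grid` and the sample values are hypothetical.

```python
import math

def snap_to_grid(z: complex, eps: float) -> complex:
    """Lower-left corner of the half-open eps-square containing z."""
    return complex(eps * math.floor(z.real / eps),
                   eps * math.floor(z.imag / eps))

def simple_approximation(values, eps):
    """Replace each value by the corner of its square: finitely many values."""
    return [snap_to_grid(z, eps) for z in values]

eps = 0.1
# Sample values of a bounded function (all inside the square [-1, 1) x [-1, 1)).
values = [0.33 + 0.71j, -0.95 + 0.02j, 0.5 - 0.5j, -0.01 - 0.99j]

approx = simple_approximation(values, eps)
max_error = max(abs(z - a) for z, a in zip(values, approx))

print(approx)
print(max_error, eps * math.sqrt(2))
```

Each value lies within the diameter of its square from the chosen corner, so the uniform error is below `eps * sqrt(2)`, and a bounded function takes values in only finitely many squares.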


Remark. We phrased the density of (ii) in terms of simple maps.
Recall that step maps are assumed to be equal to 0 outside a set of finite
measure. Thus the step maps cannot possibly be dense in $L^\infty(\mu)$ if $\mu(X)$ is
infinite, since the constant function 1 cannot be uniformly approximated
by step functions in that case. If we restrict our attention to the case
when $\mu(X)$ is finite, then step maps and simple maps coincide. In
applications this suffices, since one deals mostly with $\sigma$-finite measures, and
certain problems can be reduced to the case of finite measures. The
density statement of (iii) is also useful in the infinite dimensional case.


Consider now the case of functions (complex valued, say). We have a
bilinear map

    $$\mathcal{L}^1(\mu, \mathbf{C}) \times \mathcal{L}^\infty(\mu, \mathbf{C}) \to \mathbf{C}$$

given by

    $$(f, g) \mapsto \int_X fg \, d\mu = \langle f, g \rangle_\mu.$$

This arises from Corollary 5.11 of the dominated convergence theorem
(Chapter VI). It is clear that the value of this map on $(f, g)$ depends only
on the equivalence class of $f$ and $g$, respectively, and thus defines a
bilinear map

    $$L^1(\mu) \times L^\infty(\mu) \to \mathbf{C}.$$

Without changing anything above except the notation slightly, if we
write $\langle f, g \rangle$ instead of $fg$, and take the values of $f$, $g$ in a Hilbert space
$E$, then what we said holds, except that as usual, the map is not bilinear
but sesquilinear (i.e. linear in its first variable, but anti-linear in its
second variable, that is a complex conjugation occurs when we multiply
$g$ by a constant). For the convenience of the reader, we shall state our
results first for functions, and then for the Hilbert case. There will be
absolutely no difference in the proofs except for this change between $fg$
and $\langle f, g \rangle$.
Quite generally, let

    $$\tau: F \times G \to H$$

be a bilinear map of vector spaces into another vector space. If $v \in F$
and $w \in G$ we write $v \perp w$ and say that $v$ is orthogonal to $w$ if $\tau(v, w) = 0$.
We define the kernel on the left to consist of all $v \in F$ such that $v \perp G$,
i.e. $v$ is orthogonal to all elements of $G$, and similarly we define the
kernel on the right. These kernels are clearly subspaces of $F$ and $G$,
respectively. We say that the bilinear map is non-degenerate if the kernels
on the left and right are equal to 0. Suppose that $F$, $G$ are normed
vector spaces (or semi-normed) and that the bilinear map is continuous.
In applications, the condition

    $$|\tau(v, w)| \leq |v| \, |w|$$

is even satisfied. Then we obtain corresponding mappings of $F$ and $G$
into each other's dual spaces, namely each $v \in F$ gives rise to the
functional $\lambda_v \in G'$ given by

    $$\lambda_v(w) = \tau(v, w).$$

Similarly, each $w \in G$ gives rise to the functional $\lambda_w$ in $F'$ given by

    $$\lambda_w(v) = \tau(v, w).$$

We investigate this situation when we deal with the spaces $L^1(\mu)$ and
$L^\infty(\mu)$.


Theorem 2.2. Let $\mu$ be $\sigma$-finite. The kernels on the right and left of the
bilinear map

    $$L^1(\mu) \times L^\infty(\mu) \to \mathbf{C}$$

are 0. This map satisfies the product inequality

    $$|\langle f, g \rangle_\mu| \leq \|fg\|_1 \leq \|f\|_1 \|g\|_\infty.$$

The maps $g \mapsto \lambda_g$ and $f \mapsto \lambda_f$ for $g \in L^\infty(\mu)$ and $f \in L^1(\mu)$ induce
norm-preserving linear maps of $L^\infty(\mu)$ and $L^1(\mu)$, respectively, into the other's
dual space. In the case of $L^\infty(\mu)$, the map $g \mapsto \lambda_g$ is a norm-preserving
isomorphism between $L^\infty(\mu)$ and the dual space of $L^1(\mu)$, i.e. the map is
surjective.
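
On a finite measured space whose points all have positive mass, the statements of Theorem 2.2 reduce to elementary facts about weighted sums, which makes them easy to experiment with. The sketch below is an illustration only, with real values and made-up weights; `pairing`, `l1_norm` and `sup_norm` are hypothetical helper names, and the sup norm equals the essential sup here only because every point has positive measure.

```python
def pairing(f, g, weights):
    """The bilinear map (f, g) -> integral of f*g, as a weighted sum."""
    return sum(a * b * w for a, b, w in zip(f, g, weights))

def l1_norm(f, weights):
    return sum(abs(a) * w for a, w in zip(f, weights))

def sup_norm(g):
    return max(abs(b) for b in g)

weights = [0.5, 1.0, 2.0]   # all strictly positive (assumption of this sketch)
f = [2.0, -3.0, 0.5]
g = [1.0, -0.25, 4.0]

# Product inequality |<f, g>| <= ||f||_1 ||g||_inf.
value = pairing(f, g, weights)
bound = l1_norm(f, weights) * sup_norm(g)

# The norm of lambda_g is attained on the characteristic function of the
# point where |g| is largest.
chi = [0.0, 0.0, 1.0]
ratio = abs(pairing(chi, g, weights)) / l1_norm(chi, weights)

# The norm of lambda_f is attained on the sign function of f.
sign_f = [1.0 if a > 0 else -1.0 if a < 0 else 0.0 for a in f]
attained = pairing(f, sign_f, weights)

print(value, bound, ratio, attained)
```

The last two quantities mirror the two halves of the proof: testing against characteristic functions recovers the sup norm of `g`, and testing against the sign of `f` recovers its L^1 norm.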


Proof. Let $f \in \mathcal{L}^1(\mu)$ be orthogonal to $\mathcal{L}^\infty(\mu)$. Then $f$ is 0 almost
everywhere by Corollary 5.19 of Chapter VI (the averaging theorem).
The other side works similarly as follows. If $g$ is bounded $\mu$-measurable,
then for every measurable subset $A$ of finite measure, the map $g_A = g\chi_A$ is
in $\mathcal{L}^1$. We can therefore apply the same argument, and see that $g_A$ is 0
almost everywhere, whence $g$ is 0 almost everywhere since $\mu$ is assumed
$\sigma$-finite.

Let $C$ be a bound for $g$. Then

    $$\|fg\|_1 = \int_X |fg| \, d\mu \leq C \int_X |f| \, d\mu = C \|f\|_1.$$

This implies our inequality

    $$|\langle f, g \rangle_\mu| \leq \|fg\|_1 \leq \|f\|_1 \|g\|_\infty$$

and shows that $|\lambda_g| \leq \|g\|_\infty$. For the reverse, let $b = |\lambda_g|$. For each
subset $A$ of finite measure, we have

    $$\left| \int_X \chi_A g \, d\mu \right| \leq b\mu(A).$$

By Corollary 5.18 of the averaging theorem of Chapter VI, we conclude
that $|g(x)| \leq b$ for almost all $x$, whence $\|g\|_\infty \leq b$. Therefore $|\lambda_g| = \|g\|_\infty$.

Now on the other side, let $f \in \mathcal{L}^1(\mu)$, and define $1/|f|$ to be the map
having value $1/|f(x)|$ if $f(x) \neq 0$ and 0 if $f(x) = 0$. Then $1/|f|$ is
$\mu$-measurable, and $\bar{f}/|f|$ is $\mu$-measurable and bounded. Let $g = \bar{f}/|f|$. Then
from $\|g\|_\infty = 1$ or 0, we get

    $$\|f\|_1 = \int_X fg \, d\mu = \lambda_f(g) \leq |\lambda_f| \, \|g\|_\infty \leq |\lambda_f|.$$

This proves the reverse inequality, whence $\|f\|_1 = |\lambda_f|$.

This proves all our statements except the last, that $L^\infty(\mu)$ provides
us with all functionals on $L^1(\mu)$. To see this, we give the argument of
von Neumann (originally applied to the Radon-Nikodym theorem, see
below). Assume first that $X$ has finite measure. Let $\lambda: L^1(\mu) \to \mathbf{C}$ be a
functional, and let $b$ be its norm so that we have

    $$|\lambda f| \leq b \|f\|_1$$

for all $f \in \mathcal{L}^1(\mu)$. The functional $\lambda$ can actually be viewed as defined on
$\mathcal{L}^2(\mu)$ because any map $g$ in $\mathcal{L}^2$ on a set of finite measure $X$ is in $\mathcal{L}^1$
(use the Schwarz inequality on the pair $|g|$ and the function $1_X$). Thus
we obtain

    $$|\lambda f| \leq b \|f\|_1 = b \int_X |f| \cdot 1_X \, d\mu \leq b \|f\|_2 \|1_X\|_2.$$

This shows that $\lambda$ is continuous with respect to the $L^2$-seminorm. Since
$L^2(\mu)$ is a Hilbert space, there exists $g \in \mathcal{L}^2(\mu)$ such that we have

    $$\lambda f = \int_X fg \, d\mu$$

for all step maps $f$. For any measurable set $A$ of finite measure, we then
obtain

    $$\left| \int_X \chi_A g \, d\mu \right| = |\lambda(\chi_A)| \leq b \|\chi_A\|_1 \leq b\mu(A).$$

By Corollary 5.18 of the averaging theorem (Theorem 5.15, Chapter VI),
it follows that $|g(x)| \leq b$ for almost all $x$, whence $g$ is in fact in $\mathcal{L}^\infty(\mu)$
and $\|g\|_\infty \leq b$. Since

    $$\lambda f = \langle f, g \rangle_\mu$$

for all step maps $f$, this same relation must hold true for all $f \in \mathcal{L}^1(\mu)$
because the step maps are dense in $\mathcal{L}^1$. This proves our last assertion
when $X$ has finite measure.

The general case when $\mu$ is $\sigma$-finite follows easily. We write $X$ as a
disjoint union of sets of finite measure $X_k$ ($k = 1, 2, \ldots$). Let $f \in \mathcal{L}^1(\mu)$,
and let $f_k = f\chi_{X_k}$ be the same as $f$ on $X_k$ and 0 outside $X_k$. Then the
series

    $$\sum_{k=1}^{\infty} f_k$$

is $L^1$-convergent to $f$, say by the dominated convergence theorem, and
therefore by the continuity of $\lambda$ we have

    $$\lambda f = \sum_{k=1}^{\infty} \lambda(f_k).$$

For each $k$ there exists a $\mu$-measurable map $g_k$ on $X_k$, bounded by $b$, and
0 outside $X_k$ such that

    $$\lambda(f_k) = \int_X f_k g_k \, d\mu.$$

We let

    $$g = \sum_{k=1}^{\infty} g_k$$

(pointwise). Then $g$ is bounded by $b$, $\mu$-measurable, and it is clear that
$\lambda = \lambda_g$, thus concluding the proof of our theorem.


We now repeat the statement of Theorem 2.2 for the Hilbert case. 


Theorem 2.3 (Hilbert Case). Assume that $\mu$ is $\sigma$-finite. Let $E$ be a
Hilbert space. We have a sesquilinear map

    $$L^1(\mu, E) \times L^\infty(\mu, E) \to \mathbf{C}$$

defined for $f \in L^1(\mu, E)$ and $g \in L^\infty(\mu, E)$ by

    $$\langle f, g \rangle_\mu = \int_X \langle f, g \rangle \, d\mu.$$

The kernels on both sides are 0. The map $g \mapsto \lambda_g$ induces a
norm-preserving linear map of $L^\infty(\mu, E)$ onto the dual of $L^1(\mu, E)$ (so the map
is surjective), and the map $f \mapsto \lambda_f$ induces a norm-preserving antilinear
map of $L^1(\mu, E)$ into the dual of $L^\infty(\mu, E)$ (not necessarily surjective).


Proof. Exactly the same, except that when for instance we considered

    $$\left| \int_X \chi_A g \, d\mu \right| \leq b\mu(A)$$

in the proof of Theorem 2.2, we now have to write $\langle e\chi_A, g \rangle$ for some
unit vector $e \in E$, and apply Corollary 5.20 instead of Corollary 5.18 of
the averaging theorem.


We wish to characterize those elements of the dual of $L^\infty(\mu)$ which can
be represented by some element in $\mathcal{L}^1(\mu)$. Over the complex numbers,
the classical Radon-Nikodym theorem achieves this purpose, and can be
viewed as stating that if a functional on $L^\infty(\mu)$ can be represented by a
finite measure, then it already can be represented by a function. We first
make some comments in this case.

Let $\nu$ be a positive measure on $\mathcal{M}$. We say that $\nu$ is absolutely
continuous with respect to $\mu$, or $\mu$-absolutely continuous, if we have
$\nu(A) = 0$ whenever $\mu(A) = 0$. (Cf. Exercise 1.) We say that a functional $\lambda$ on
$L^\infty(\mu)$ can be represented by a positive measure $\nu$ if the functional has the
form

    $$\lambda: g \mapsto \int_X g \, d\nu.$$

We then write $\lambda = d\nu$. If the functional can be so represented, then $\nu$ is
necessarily absolutely continuous with respect to $\mu$, because the functional
vanishes on characteristic functions of measurable sets $A$ such that
$\mu(A) = 0$. The Radon-Nikodym theorem in its classical form states:

If $\nu$ is a finite positive measure on $\mathcal{M}$ which is $\mu$-absolutely continuous,
then there exists some $f \in \mathcal{L}^1(\mu)$ such that for all $A \in \mathcal{M}$ we have

    $$\nu(A) = \int_A f \, d\mu.$$

This measure is conveniently denoted by $\mu_f$.

A functional $d\nu$ can be viewed as functional on various spaces (e.g.
spaces of step maps, $L^\infty(\mu)$, etc.). We shall always make it explicit on
which space we intend this functional to be. We observe that a functional
on $L^\infty(\mu)$ represented by a map in $\mathcal{L}^1$ is determined by its values
on step maps. Thus actually we can limit our attention to step maps.
But it is reasonable also to ask for functionals on simple maps, without
any reference to the measure $\mu$, and with continuity with respect to the
sup norm. For this purpose, we need another definition.

A positive measure ν on ℳ is said to be concentrated or carried in a
measurable set A if ν(Y) = 0 for all measurable Y contained in the
complement of A. If ν_1, ν_2 are two positive measures on ℳ we say that
they are orthogonal or singular to each other and write ν_1 ⊥ ν_2 if there
exists a decomposition

\[ X = A \cup B \]

of X into a disjoint union of measurable sets such that ν_1 is concentrated
in A and ν_2 is concentrated in B.

Let ℒ^∞(ℳ, C) denote the space of bounded measurable functions on
X. We make no reference to any measure here at all, and we take the
sup norm on this space. Let ν be a finite positive measure on ℳ. This
means that ν(X) < ∞. Then ν gives rise to a functional on ℒ^∞(ℳ, C) by
the map

\[ g \mapsto \int_X g \, d\nu \]

satisfying the bound

\[ \left| \int_X g \, d\nu \right| \le \|g\|\,\nu(X). \]


Theorem 2.4 (Radon-Nikodym and Lebesgue). Assume that μ is σ-finite,
and let ν be a finite positive measure on ℳ. Then there exists a
unique decomposition

\[ \nu = \nu_a + \nu_s \]

as a sum of positive measures, such that ν_a is absolutely continuous with
respect to μ, and ν_s is singular with respect to μ. We have ν_a ⊥ ν_s.
Finally, if ν is absolutely continuous with respect to μ, then there exists
an element f ∈ ℒ¹(μ) such that ν = μ_f, and f is uniquely determined up
to equivalence. Furthermore, the functional on L^∞(μ) represented by ν is
then also represented by f, i.e. we have dν = f dμ on L^∞(μ).


Proof (von Neumann). The uniqueness is essentially obvious. If we
can write

\[ d\nu = f_1\,d\mu + d\nu_1 = f_2\,d\mu + d\nu_2, \]

with f_1, f_2 ∈ ℒ¹(μ), and ν_1, ν_2 singular with respect to μ, then

\[ (f_1 - f_2)\,d\mu = d\nu_2 - d\nu_1, \]

whence f_1 − f_2 is 0 almost everywhere, and ν_1 = ν_2.

Now for existence, we assume first that μ(X) is finite. Then μ + ν is a
finite positive measure, and we consider the integral with respect to μ + ν
on ℒ^∞(ℳ), i.e. on the bounded measurable functions. Since all sets have
finite measure, we don't need to specify that we deal with step functions
vanishing outside a set of finite measure. Using the Schwarz inequality
with respect to L²(μ + ν), we have for any step function φ:

\[ \left| \int_X \varphi\,d\nu \right| \le \int_X |\varphi|\,d\nu
   \le \int_X |\varphi|\,d(\mu+\nu) \le \|\varphi\|_2\,\|1_X\|_2, \]

where 1_X is the function equal to 1 on X. Hence the map

\[ \varphi \mapsto \int_X \varphi\,d\nu \]

is L²(μ + ν)-continuous on step functions, whence it extends uniquely to
a functional on L²(μ + ν). By the L² duality, there exists a function

\[ h \in \mathcal{L}^2(\mu+\nu) \]

(uniquely determined up to equivalence) such that for all step functions φ
we have

\[ \int_X \varphi\,d\nu = \int_X \varphi h\,d(\mu+\nu). \]


Letting φ be the characteristic function of a measurable set A, we find

\[ \int_A h\,d(\mu+\nu) = \nu(A) \le (\mu+\nu)(A). \]

By the averaging theorem (Theorem 5.15 of Chapter VI) we may assume
without loss of generality that 0 ≤ h ≤ 1, and setting h equal to 0 on a
set of (μ + ν)-measure 0, we may also assume that h is measurable.

For step functions φ we have

\[ \int_X \varphi\,d(\mu+\nu) = \int_X \varphi\,d\mu + \int_X \varphi\,d\nu, \]

whence the same holds if φ is any bounded measurable function, by
the dominated convergence theorem. Consequently for φ ∈ ℒ^∞(ℳ) we
have

\[ (1) \qquad \int_X \varphi\,d\nu = \int_X \varphi h\,d\mu + \int_X \varphi h\,d\nu. \]

Let Y be the set of all x ∈ X such that 0 ≤ h(x) < 1, and let Z be the set
of all x ∈ X such that h(x) = 1. First let φ be the characteristic function
of Z. From (1) we see that μ(Z) = 0. Let φ be arbitrary (bounded
measurable) and iterate (1). By induction we obtain

\[ (2) \qquad \int_X \varphi\,d\nu
   = \int_X \varphi\,(h + h^2 + \cdots + h^n)\,d\mu + \int_X \varphi h^n\,d\nu. \]

Take the limit as n → ∞. The dominated convergence theorem shows
that

\[ \int_X \varphi h^n\,d\nu \to \int_Z \varphi\,d\nu \quad \text{as } n \to \infty. \]

Let

\[ f = \frac{h}{1-h} \]

on Y and 0 outside Y. Since μ(Z) = 0, the first integral on the right is
really carried by Y, and taking the limit yields

\[ \int_X \varphi\,d\nu = \int_Y \varphi f\,d\mu + \int_Z \varphi\,d\nu. \]

We define ν_s to be the measure obtained from ν by

\[ \nu_s(A) = \nu(A \cap Z). \]

We could also write ν_s = ν_Z. We let ν_a be the measure represented by μ_f
on Y and 0 outside Y. We see that our theorem is proved in the finite
case.
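
The two computations behind the finite case can be written out. The following is a sketch in the notation of the proof, assuming only formula (1), namely ∫ φ dν = ∫ φh dμ + ∫ φh dν for bounded measurable φ, together with 0 ≤ h ≤ 1 and the finiteness of ν:

```latex
% mu(Z) = 0: take phi = chi_Z in (1); h = 1 on Z and nu(Z) is finite.
\[
\nu(Z) = \int_Z h\,d\mu + \int_Z h\,d\nu = \mu(Z) + \nu(Z)
\quad\Longrightarrow\quad \mu(Z) = 0 .
\]
% Induction step for (2): apply (1) to the bounded measurable function phi h^n.
\[
\int_X \varphi h^{n}\,d\nu
  = \int_X \varphi h^{n+1}\,d\mu + \int_X \varphi h^{n+1}\,d\nu .
\]
% Geometric series at points where 0 <= h(x) < 1; this is where f = h/(1-h) comes from.
\[
\sum_{k=1}^{n} h(x)^{k} = \frac{h(x)\bigl(1 - h(x)^{n}\bigr)}{1 - h(x)}
\;\longrightarrow\; \frac{h(x)}{1 - h(x)} \qquad (n \to \infty).
\]
```

Substituting the second line into the remainder term of (2) for n gives (2) for n + 1, which is the induction.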

The extension to the σ-finite case follows easily as in Theorem 2.2.
We express X as a disjoint union of measurable sets {X_n} of finite
measure, apply the finite result to each piece, and see that we get the
expected convergence.
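
To see the construction at work, here is a small illustrative example (not from the text). We assume X = [0, 1] with Lebesgue measure μ on the Borel sets, and take ν = μ + δ_0, where δ_0 denotes the unit point mass at 0. The function h below is a guess that one then verifies against the defining relation ∫ φ dν = ∫ φh d(μ + ν).

```latex
% Setting (an assumption for illustration): X = [0,1], mu = Lebesgue, nu = mu + delta_0,
% so that mu + nu = 2 mu + delta_0.
\[
h(x) = \begin{cases} 1, & x = 0, \\ \tfrac12, & 0 < x \le 1 . \end{cases}
\]
% Check of the defining relation for bounded measurable phi:
\[
\int_X \varphi h\,d(\mu+\nu)
  = \int_X \varphi \cdot \tfrac12 \cdot 2\,d\mu + \varphi(0)\cdot 1
  = \int_X \varphi\,d\mu + \varphi(0) = \int_X \varphi\,d\nu .
\]
% Hence Z = {0}, Y = (0,1], mu(Z) = 0, and on Y
\[
f = \frac{h}{1-h} = \frac{1/2}{1/2} = 1, \qquad
\nu_a = \mu_f = \mu, \qquad \nu_s(A) = \nu(A \cap \{0\}) = \delta_0(A).
\]
```

So in this case the recipe of the proof returns the expected decomposition: the Lebesgue part is absolutely continuous and the point mass is singular.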


The Lebesgue part is the decomposition into absolutely continuous
and singular measures. The representation of ν by f is the
Radon-Nikodym part of the theorem. We look further into this. It is
reasonable to expect it to hold in Hilbert space, in the sense that if a
functional on L^∞(μ, E) can be represented by a “measure”, then it can be
represented by some f ∈ ℒ¹(μ). When I mentioned this to Palais, he pointed
out to me that if one takes the right definition of measure, then the result
follows at once from the positive case, and I am indebted to him for the
following corollary.

If ν is a positive measure on ℳ, absolutely continuous with respect to
μ, and h ∈ ℒ¹(ν, E) where E is a Hilbert space, then we get a functional λ
on L^∞(μ, E) by

\[ g \mapsto \int_X \langle g, h\rangle\,d\nu = \lambda(g). \]

It will be convenient to denote this functional by h dν. In this case, we
also say that λ can be represented by a finite (E-valued) measure. This
terminology will be justified in the next section.


Corollary 2.5. Assume that μ is σ-finite. Let E be a Hilbert space and
let ν be a positive measure on ℳ, absolutely continuous with respect to μ.
Let h ∈ ℒ¹(ν, E). Then there exists f ∈ ℒ¹(μ, E), uniquely determined
up to equivalence, such that h dν = f dμ. In other words, if a functional
on L^∞(μ, E) can be represented by a finite E-valued measure, then it
can be represented by a map f in ℒ¹(μ, E).


Proof. Let 1/|h| denote the function equal to 0 at a point x such that
h(x) = 0, and equal to 1/|h(x)| if h(x) ≠ 0. Then h/|h| is μ-measurable
and bounded, and |h| is in ℒ¹(ν, R). Then |h| dν is a positive measure
on ℳ, which is absolutely continuous with respect to μ. By the positive
Radon-Nikodym theorem, we conclude that |h| dν = k dμ, where k is
positive and in ℒ¹(μ). Then

\[ f = \frac{h}{|h|}\,k \]

is in ℒ¹(μ, E), being the product of a bounded μ-measurable map and an
element in ℒ¹(μ). It is then clear that this f satisfies our requirements.
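
The final verification can be spelled out as follows. This is a sketch, assuming the choice f = (h/|h|)k with |h| dν = k dμ as in the proof, and using that k is real valued so that it can be moved across the scalar product:

```latex
% g is bounded measurable; <g, h/|h|> is then a bounded measurable scalar function,
% so the identity |h| dnu = k dmu applies to it.
\[
\int_X \langle g, f\rangle\,d\mu
  = \int_X \Bigl\langle g, \frac{h}{|h|}\Bigr\rangle\,k\,d\mu
  = \int_X \Bigl\langle g, \frac{h}{|h|}\Bigr\rangle\,|h|\,d\nu
  = \int_X \langle g, h\rangle\,d\nu .
\]
% Integrability of f: |f| <= k pointwise.
\[
\int_X |f|\,d\mu \le \int_X k\,d\mu = \int_X |h|\,d\nu < \infty .
\]
```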

We shall see in the next section that, in fact, we can start from the 
“measure” point of view to arrive at our functionals. 


VII, §3. COMPLEX AND VECTORIAL MEASURES

Let ℳ be a σ-algebra in X and E a Banach space.


Instead of considering real positive valued measures, we wish to investi- 
gate complex valued measures satisfying the same countable additivity 
property. It is then clearer to start with Banach valued measures, so that 
we see clearly where the property of finite dimensionality is used for 
certain results peculiar to the complex numbers. Again, no proof would 
be made shorter if we were to assume from the start that E=C. In any 
case, finite or infinite dimensional spaces are useful in a number of 
applications. 

By a decomposition of a measurable set A, we mean a sequence {A_n}
of disjoint measurable sets whose union is A. (We don't use the word
partition, which was used for a finite decomposition of a set of finite
measure with respect to a positive measure.) A map

\[ \nu\colon \mathcal{M} \to E \]

is called countably additive if ν(∅) = 0, and if for every A ∈ ℳ and
every decomposition {A_n} of A we have

\[ \nu(A) = \sum_{n=1}^{\infty} \nu(A_n). \]

This infinite sum is to be interpreted as convergent to the same value,
independent of the ordering of the terms. Its value is in E. We now
consider properties of such a countably additive map.

The limiting properties of a positive measure are again satisfied in the
present case, namely:

Let {Y_n} be an increasing sequence of measurable sets such that
⋃ Y_n = Y. Then

\[ \lim_{n\to\infty} \nu(Y_n) = \nu(Y). \]

Similarly, if {Y_n} is a decreasing sequence of measurable sets, and
Y = ⋂ Y_n, then

\[ \lim_{n\to\infty} \nu(Y_n) = \nu(Y). \]

The proof is obvious, as for the positive measures. We define a
function

\[ |\nu|\colon \mathcal{M} \to [0, \infty] \]

by letting

\[ |\nu|(A) = \sup \sum_n |\nu(A_n)|, \]

the sup being taken over all decompositions {A_n} of A. We shall prove
that |ν| is a positive measure, and that if E = C (or is finite dimensional),
then |ν| is in fact real valued, i.e. finite.

We observe that if A ⊂ B are measurable, then

\[ |\nu|(A) \le |\nu|(B). \]

This is obvious, because if {A_n} is a decomposition of A, then {A_n, B − A}
is a decomposition of B. In particular, if |ν|(B) is finite, so is |ν|(A).

Theorem 3.1. Let ν: ℳ → E be countably additive. Then |ν| is a
positive measure.


Proof. Let {A_n} be a decomposition of A ∈ ℳ. Let b_n be a real
number ≥ 0 such that b_n ≤ |ν|(A_n). Given ε > 0, let {A_nj} be a
decomposition of A_n such that

\[ b_n - \frac{\varepsilon}{2^n} \le \sum_j |\nu(A_{nj})|. \]

Then we may view {A_nj} (n, j = 1, 2, ...) as a decomposition of A, and
therefore summing over n yields

\[ \sum_n b_n - \varepsilon \le \sum_n \sum_j |\nu(A_{nj})| \le |\nu|(A). \]

Taking the sup over all {b_n} and letting ε → 0, we get

\[ \sum_n |\nu|(A_n) \le |\nu|(A). \]

Conversely, let {B_j} be any decomposition of A. By the countable
additivity of ν applied to the decomposition {A_n ∩ B_j} (n = 1, 2, ...) of
B_j, we get

\[ \sum_j |\nu(B_j)| = \sum_j \Bigl| \sum_n \nu(A_n \cap B_j) \Bigr|
   \le \sum_n \sum_j |\nu(A_n \cap B_j)| \le \sum_n |\nu|(A_n). \]

This is true for all decompositions {B_j} of A, whence we get the reverse
inequality

\[ |\nu|(A) \le \sum_n |\nu|(A_n), \]

thus proving our theorem.

The measure |ν| is sometimes called the total variation of ν.
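
A simple illustration of the total variation, not taken from the text: we assume that the singletons {x_j} are measurable, and consider a finite combination of point masses with complex coefficients c_j at distinct points x_j. The sketch below computes |ν| directly from the definition as a sup over decompositions.

```latex
% Assumption for illustration: nu = sum_j c_j delta_{x_j}, finitely many distinct points x_j.
\[
\nu = \sum_{j=1}^{m} c_j\,\delta_{x_j}, \qquad
\nu(A) = \sum_{x_j \in A} c_j .
\]
% Upper bound: for any decomposition {A_n} of A, the triangle inequality gives
\[
\sum_n |\nu(A_n)| = \sum_n \Bigl| \sum_{x_j \in A_n} c_j \Bigr|
  \le \sum_{x_j \in A} |c_j| .
\]
% Equality holds for a decomposition separating the points x_j, hence
\[
|\nu|(A) = \sum_{x_j \in A} |c_j| .
\]
% Example: nu = delta_0 - delta_1 on [0,1] has nu(X) = 0 but |nu|(X) = 2.
```

In particular |ν|(A) can be much larger than |ν(A)|, since cancellation between the coefficients is ignored.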


Theorem 3.2. If E is finite dimensional, and ν: ℳ → E is countably
additive, then |ν| is real valued, i.e. finite.


Proof. The general case reduces at once to the real case
(componentwise). We deal with the real case as in Saks [Sa]. Suppose that
|ν|(X) = ∞. We first observe that there exist measurable subsets of X
whose measures have arbitrarily large absolute values. This is seen as
follows. We take a decomposition {X_n} of X such that

\[ \sum_n |\nu(X_n)| \]

is large. We combine all those terms with indices n such that ν(X_n) have
the same sign. For either + or −, the corresponding sum will be large.
We take a finite number of such n, but sufficiently many so that the union
of the corresponding X_n is a subset B with |ν(B)| large. All we need here
is the finite additivity of ν.

Now we construct a decreasing sequence of subsets of X having
measures whose absolute values tend to infinity. Let X = A_1. By what
we have just seen, there exists a subset B ⊂ A_1 such that

\[ |\nu(B)| \ge |\nu(A_1)| + 2. \]

If |ν|(B) = ∞, we let A_2 = B. If |ν|(B) is finite, then

\[ |\nu|(A_1 - B) = \infty \]

and we let A_2 = A_1 − B. Then

\[ |\nu(A_2)| \ge |\nu(B)| - |\nu(A_1)| \ge 2. \]

It is clear that we could have replaced 2 by any number. Repeating the
procedure inductively, we get a decreasing sequence

\[ A_1 \supset A_2 \supset A_3 \supset \cdots \]

such that |ν(A_n)| ≥ n. Let A = ⋂ A_n. The countable additivity of ν now
yields a contradiction, because

\[ \nu(A) = \lim_{n\to\infty} \nu(A_n). \]

This proves our theorem.


Example. The following is an example in which the conclusion of
Theorem 3.2 fails. It is already in a paper of Birkhoff (Trans. Amer.
Math. Soc., 38 (1935) pp. 357-378). Let E = l² be the space of
sequences {a_n} of (say) real numbers such that Σ a_n² converges, with the
standard scalar product. Then E is a Hilbert space. Let μ be Lebesgue
measure on the line. For each positive integer n and measurable set A
let

\[ \nu_n(A) = \frac{1}{n}\,\mu\bigl(A \cap [n-1, n)\bigr). \]

Let ν(A) be the sequence whose n-th term is ν_n(A). It is clear that the
total variation of ν is infinite and that ν is countably additive on the
positive line X consisting of all real numbers ≥ 0.

By a vectorial measure on ℳ we shall mean a countably additive map

\[ \nu\colon \mathcal{M} \to E \]

such that |ν|(X) is finite, i.e. such that |ν| is a real valued positive
measure. [Recall that if A ⊂ B, then |ν|(A) ≤ |ν|(B).] For simplicity, we
also call a vectorial measure a measure, and when we have to make a
distinction with the objects discussed in Chapter VI or the preceding
sections, we emphasize this and say positive measure for the former
object. Another way of making the distinction is to say (even more
correctly) an E-valued measure for our map ν: ℳ → E.
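
The Birkhoff example above is countably additive but is not a vectorial measure in this sense. A sketch of the two estimates involved, assuming the definition ν_n(A) = (1/n) μ(A ∩ [n−1, n)) with μ Lebesgue measure and E = l²:

```latex
% nu(A) lies in l^2 for every measurable A, even when mu(A) is infinite,
% because mu(A \cap [n-1,n)) <= 1:
\[
|\nu(A)|^2 = \sum_{n=1}^{\infty} \frac{\mu\bigl(A \cap [n-1,n)\bigr)^2}{n^2}
  \le \sum_{n=1}^{\infty} \frac{1}{n^2} < \infty .
\]
% Total variation: use the decomposition X_n = [n-1, n) of X = [0, \infty).
% nu(X_n) has 1/n in the n-th place and 0 elsewhere, so |nu(X_n)| = 1/n and
\[
|\nu|(X) \ge \sum_{n=1}^{\infty} |\nu(X_n)| = \sum_{n=1}^{\infty} \frac{1}{n} = \infty .
\]
```

The contrast between the convergent series Σ 1/n² and the divergent series Σ 1/n is exactly what makes the example work.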

It is clear that E-valued measures form a vector space denoted by
M¹(ℳ, E), or simply M¹. For such a measure, we define

\[ \|\nu\| = |\nu|(X). \]

Then it is verified at once that || || is a norm (not merely a seminorm) on
M¹. In fact, M¹ is complete, i.e. a Banach space. The proof is a routine
ε/2^n proof which we leave to the reader. Theorem 3.2 shows that the
complex measures on ℳ are precisely the complex valued, countably
additive functions on ℳ.


Note. Our terminology is adjusted to the applications we are going to
make. It would be more proper to define an E-valued measure to be
simply a countably additive map ν: ℳ → E such that ν(∅) = 0, and
then to define a bounded measure to be such a map with |ν|(X) < ∞. In the
sequel, we are concerned only with bounded measures or with complex
measures (which are automatically bounded), so that we have taken the
convention as described above.


Example. Let μ be a positive measure on ℳ and let f ∈ ℒ¹(μ).
Define μ_f by

\[ \mu_f(A) = \int_A f\,d\mu. \]

Then it is immediately verified that μ_f is a measure, and that

\[ \|\mu_f\| \le \|f\|_1. \]


We shall prove the reverse inequality after a remark, which it is
convenient to formulate in a slightly more general context than we need for
the next theorem. Let μ be a positive measure, and let ν: ℳ → E be an
E-valued measure. We say that ν is absolutely continuous with respect to
μ, or to put it shortly is μ-continuous, if either one of the following two
conditions is satisfied.

AC 1. If A ∈ ℳ and μ(A) = 0, then ν(A) = 0.
AC 2. Given ε, there exists δ such that μ(A) < δ implies that |ν(A)| < ε.


We shall prove that these two conditions are equivalent. It is clear that
AC 2 implies AC 1. Conversely, assume AC 1. If AC 2 is false, there
exists ε > 0 such that for each positive integer n there exists a set Y_n
with μ(Y_n) < 1/2^n, but |ν(Y_n)| ≥ ε. Then |ν|(Y_n) ≥ ε. Let

\[ Z_n = Y_n \cup Y_{n+1} \cup \cdots \]

and let Z = ⋂ Z_n. Then μ(Z) = 0, but

\[ |\nu|(Z) = \lim_{n\to\infty} |\nu|(Z_n) \ge \varepsilon \]

because Y_n ⊂ Z_n. Hence there is some measurable subset Z' of Z such
that ν(Z') ≠ 0, contradicting AC 1 because μ(Z') = 0. This proves the
equivalence between our two conditions.
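
The claim μ(Z) = 0 in this argument rests on a geometric series estimate. A sketch, assuming sets Y_n with μ(Y_n) < 1/2^n and Z_n equal to the union of the Y_k for k ≥ n:

```latex
% Countable subadditivity of the positive measure mu, then a geometric series:
\[
\mu(Z) \le \mu(Z_n) \le \sum_{k=n}^{\infty} \mu(Y_k)
  < \sum_{k=n}^{\infty} \frac{1}{2^{k}} = \frac{1}{2^{\,n-1}}
  \qquad \text{for every } n,
\]
% and letting n tend to infinity gives
\[
\mu(Z) = 0 .
\]
```

The limit formula for |ν|(Z_n) uses that |ν| is a finite positive measure, so that it is continuous along the decreasing sequence {Z_n}.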


Remark 1. If f ∈ ℒ¹(μ), then the measure μ_f is obviously μ-continuous.


Remark 2. The measure ν is μ-continuous if and only if |ν| is
μ-continuous.


Theorem 3.3. Let μ be a positive measure on ℳ and let f ∈ ℒ¹(μ).
Then

\[ \|\mu_f\| = \|f\|_1. \]

The map f ↦ μ_f is a norm-preserving embedding L¹(μ) → M¹.


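Proof. It suffices to prove the inequality ||f||_1 ≤ ||μ_f||. We may
assume ||f||_1 > 0. Given ε > 0, there exists a set A of finite measure such
that

\[ \int_X |f|\,d\mu - \varepsilon \le \int_A |f|\,d\mu. \]

By μ-continuity, there exists δ such that if Z is a set with μ(Z) < δ, then

\[ \int_Z |f|\,d\mu < \varepsilon. \]

By the fundamental lemma of integration (Lemma 3.1 of Chapter VI),
there exists a step map φ and a set Z of measure < δ such that Z is
contained in A and such that we have for all x ∈ A − Z:

\[ |f(x) - \varphi(x)| < \frac{\varepsilon}{\mu(A)}. \]

Write

\[ \varphi = \sum_{i=1}^{n} v_i \chi_{A_i} \]

where {A_i} is a partition of A − Z, and let φ be 0 outside A. We have:

\[ \int_X |f|\,d\mu - \varepsilon \le \int_A |f|\,d\mu
   \le \int_{A-Z} \bigl(|f - \varphi| + |\varphi|\bigr)\,d\mu + \varepsilon
   \le |\mu_\varphi|(A) + 2\varepsilon \le |\mu_f|(X) + 3\varepsilon. \]

For the last step, note that on A − Z we have μ_φ = μ_f + μ_{φ−f}, and
the total variation of μ_{φ−f} on A − Z is at most the integral of |φ − f|
over A − Z, which is less than ε. Since ε is arbitrary, this proves our
theorem.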


Corollary 3.4. For any step map g we have

\[ \int_X g\,d|\mu_f| = \int_X g\,|f|\,d\mu. \]

Or symbolically, on step maps,

\[ d|\mu_f| = |f|\,d\mu. \]


Proof. For each measurable set A, we can apply Theorem 3.3 with
respect to A and get

\[ |\mu_f|(A) = \int_A |f|\,d\mu. \]

The result for step maps follows by linearity.


We shall interpret measures as functionals. Let E be a Hilbert space
or the complex numbers and let ν be an E-valued measure on ℳ. We
first view ν as inducing a linear map on step mappings with respect to
|ν|. Let φ ∈ St(|ν|) and let us write

\[ \varphi = \sum_{i=1}^{n} v_i \chi_{A_i}, \]

where {A_1, ..., A_n} is a partition of a set A having finite |ν|-measure. We
define dν by

\[ \int_X \varphi\,d\nu = \langle \varphi, d\nu\rangle = \sum_{i=1}^{n} v_i\,\nu(A_i). \]

This obviously satisfies properties similar to those of the integral of
Chapter VI, §2. Note that we wrote v_i ν(A_i) instead of

\[ \langle v_i, \nu(A_i)\rangle \]

to fit the notation of functions better. In particular, since

\[ |\nu(A_i)| \le |\nu|(A_i), \]

we have the inequality

\[ |\langle \varphi, d\nu\rangle| \le \int_X |\varphi|\,d|\nu| \le \|\varphi\|_2\,\|1_X\|_2, \]

where the L²-seminorm is taken with respect to |ν|. (There is no other
positive measure floating around at the moment.) Consequently dν is
L²-continuous on St(|ν|) and can thus be extended to a unique functional
on L²(|ν|) since the step maps are dense. By the L²-duality, we know
that there exists a unique (up to a set of |ν|-measure 0) map h ∈ ℒ²(|ν|)
such that on all step maps,

\[ d\nu = h\,d|\nu|. \]

In other words, for all step maps φ we have

\[ \langle \varphi, d\nu\rangle = \int_X \varphi h\,d|\nu|. \]

We shall say that dν is represented by h. Since |ν| is finite, we know that
h is in ℒ¹(|ν|) (Schwarz inequality on |h| and 1_X).

We state the next theorem first for the complex numbers, for the 
convenience of the reader interested only in the complex case. 


Theorem 3.5. Let ν be a complex measure on ℳ. There exists a
measurable function h on X such that |h| = 1 and such that for all

\[ \varphi \in \mathrm{St}(|\nu|, \mathbf{C}) \]

we have

\[ \langle \varphi, d\nu\rangle = \int_X \varphi h\,d|\nu|. \]

This function h is uniquely determined up to |ν|-equivalence.


Proof. We have already found such an h in ℒ¹ and we must show
that |h| = 1. We may assume that h is measurable. For r > 0 let S_r be
the set of all x ∈ X such that |h(x)| < r. Let {A_n} be a decomposition of
S_r. Then

\[ \sum_n |\nu(A_n)| = \sum_n \left| \int_X \chi_{A_n} h\,d|\nu| \right|
   \le \sum_n r\,|\nu|(A_n) = r\,|\nu|(S_r). \]

This shows that |ν|(S_r) ≤ r|ν|(S_r). If r < 1, we must have |ν|(S_r) = 0.
Hence |h(x)| ≥ 1 for almost all x. Changing h on a set of measure 0, we
may assume |h(x)| ≥ 1 for all x.

For the reverse inequality, let A be a measurable set. Then from the
definition of h we have

\[ \left| \int_X \chi_A h\,d|\nu| \right| = |\langle \chi_A, d\nu\rangle| = |\nu(A)| \le |\nu|(A). \]

The averaging theorem (Theorem 5.15 of Chapter VI and its corollaries)
shows that |h| ≤ 1 almost everywhere. This proves our theorem.


Corollary 3.6 (Hahn-Jordan Decomposition of a Measure). Let ν be a
real valued measure on ℳ and define

\[ \nu^+ = \tfrac12(|\nu| + \nu) \quad \text{and} \quad \nu^- = \tfrac12(|\nu| - \nu). \]

Then the expression ν = ν⁺ − ν⁻ gives a decomposition of ν into a
difference of two mutually singular positive measures, and any such
decomposition is uniquely determined. If X = A ∪ B is a decomposition
into two disjoint measurable sets such that ν⁺ is carried by A and ν⁻ is
carried by B, then

\[ \nu^+(Y) = \sup \nu(Z) \quad \text{for } Z \in \mathcal{M} \text{ and } Z \subset Y; \]
\[ -\nu^-(Y) = \inf \nu(Z) \quad \text{for } Z \in \mathcal{M} \text{ and } Z \subset Y. \]


Proof. We sketch the proof. By Theorem 3.5, there exists a real
valued function h such that |h| = 1 and

\[ d\nu = h\,d|\nu|. \]

Then h takes on only the values 1 and −1. Let A be the set of points
where h takes the value 1, and let B be the set where h takes the value
−1. Let ν_1 and ν_2 now be defined by the formulas

\[ \nu_1 = |\nu|_A \quad \text{and} \quad \nu_2 = |\nu|_B. \]

It is then clear that ν_1 and ν_2 are mutually singular, and it is
immediately verified that ν_1 = ν⁺, ν_2 = ν⁻. We leave the uniqueness and the
proof of the last properties as an exercise.
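
As an illustration, the decomposition can be computed explicitly for a measure with a density. The following sketch assumes ν = μ_f with f real valued and in ℒ¹(μ), and uses the formula |μ_f|(A) = ∫_A |f| dμ from the proof of Corollary 3.4; the notation f⁺, f⁻ for the positive and negative parts of f is introduced here only for the example.

```latex
% Positive and negative parts of f (notation assumed for this example):
\[
f^+ = \tfrac12(|f| + f) = \max(f, 0), \qquad
f^- = \tfrac12(|f| - f) = \max(-f, 0).
\]
% Then, for every measurable set Y,
\[
\nu^+(Y) = \tfrac12\bigl(|\nu|(Y) + \nu(Y)\bigr)
         = \int_Y \tfrac12(|f| + f)\,d\mu = \int_Y f^+\,d\mu ,
\qquad
\nu^-(Y) = \int_Y f^-\,d\mu .
\]
% One admissible choice of the carrying sets:
\[
A = \{x : f(x) \ge 0\}, \qquad B = \{x : f(x) < 0\}.
\]
```

Here ν⁺ is carried by A and ν⁻ by B, and the sup formula of the corollary is attained at Z = Y ∩ A.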


VII, §4. COMPLEX OR VECTORIAL MEASURES
AND DUALITY


In this section we discuss the duality arising from complex or Hilbert 
space valued measures. We let E be a Hilbert space, which the reader 
may assume to be C in first reading, although as usual, no changes 
would be needed. 


Theorem 4.1 (Hilbert Case). Let E be a Hilbert space and let ν be an
E-valued measure on ℳ. There exists a measurable map h: X → E such
that |h| = 1 and such that for all φ ∈ St(|ν|, E) we have

\[ \langle \varphi, d\nu\rangle = \int_X \langle \varphi, h\rangle\,d|\nu|. \]

This map h is uniquely determined up to |ν|-equivalence.


Proof. Identical with that of Theorem 3.5, except that we must insert
unit vectors e and write eχ_A or eχ_{A_n} in the appropriate place.


Corollary 4.2 (Radon-Nikodym, Hilbert Case). Let E be a Hilbert
space. Let μ be a σ-finite positive measure on ℳ, and let ν be an
E-valued measure on ℳ such that ν is μ-continuous. Then there exists
f ∈ ℒ¹(μ, E) such that ν = μ_f, uniquely determined up to μ-equivalence.


Proof. We can write (by the real form of Radon-Nikodym)

\[ d|\nu| = k\,d\mu \]

with some positive k in ℒ¹(μ, R), whence by the theorem, on step maps
we get (cf. Exercise 15)

\[ d\nu = h\,d|\nu| = hk\,d\mu, \]

as was to be shown.

If μ is a positive measure on ℳ, we can now associate with each
μ-continuous E-valued measure ν on ℳ a functional, again denoted by
dν, on L^∞(μ, E). Indeed, if we write

\[ \nu = \mu_f \]

with f ∈ ℒ¹(μ, E), then we define dν by

\[ \langle g, d\nu\rangle = \int_X \langle g, f\rangle\,d\mu. \]

Let us denote by M¹(μ, E) the vector space of μ-continuous E-valued
measures on ℳ.


Corollary 4.3. Let μ be σ-finite, and E a Hilbert space. We have
arrows:

\[ L^1(\mu, E) \to M^1(\mu, E) \to L^\infty(\mu, E)'. \]

The first arrow, given by f ↦ μ_f, is a norm-preserving isomorphism
between L¹(μ, E) and M¹(μ, E). The second, given by ν ↦ dν, is a
norm-preserving anti-linear map of M¹(μ, E) into the dual of L^∞(μ, E). If
ν = μ_f with f ∈ ℒ¹(μ, E), then

\[ |d\nu| = \|\nu\| = \|f\|_1. \]


Proof. The norm statements are obtained by combining Theorem 3.3
and Theorem 2.3. All other statements summarize what has already been
proved.


We now determine a necessary and sufficient condition for a
functional on L^∞(μ, E) to be expressible in the form f dμ, with some f in
ℒ¹(μ, E). In other words, we characterize the image of the map

\[ M^1(\mu, E) \to L^\infty(\mu, E)'. \]

We shall say that a functional

\[ \lambda\colon \mathrm{St}(\mu, E) \to \mathbf{C} \]

is μ-continuous if there exists a positive real valued function t on ℳ such
that

\[ \lim_{\mu(A)\to 0} t(A) = 0, \]

and such that for every g ∈ St(μ, E) we have

\[ |\lambda(g_A)| \le \|g\|_\infty\,t(A). \]

Similarly we define μ-continuity on the bounded measurable maps, taking
g to be such a map. We recall that g_A = χ_A g.

Corollary 4.4. Assume that μ(X) is finite. Every μ-continuous
functional on the step maps St(μ, E) has a unique extension to a μ-continuous
functional on L^∞(μ, E). A functional λ on L^∞(μ, E) can be written in
the form f dμ with some f ∈ ℒ¹(μ, E) if and only if it is μ-continuous.


Proof. If f ∈ ℒ¹(μ, E), then for any bounded measurable g we have

\[ \left| \int_A \langle g, f\rangle\,d\mu \right| \le \|g\|_\infty \int_A |f|\,d\mu \]

so that our condition of μ-continuity is satisfied. Conversely, let λ be a
μ-continuous functional on the step maps St(μ, E). To see that an
extension to L^∞(μ, E) is unique we note that if g ∈ ℒ^∞(μ, E), then given ε
there exists a set Z with μ(Z) < ε and a sequence of step maps {φ_n} which
converges uniformly to g outside Z. This is true because on a set of
finite measure, every bounded measurable map is in ℒ¹, and the
fundamental lemma of integration (Lemma 3.1 of Chapter VI) gives us such
approximation. It follows that any μ-continuous extension of λ to all of
L^∞(μ, E) is uniquely determined.

We prove existence by representing λ on the step maps as dν for some
measure ν. For each fixed measurable A we consider the map

\[ v \mapsto \lambda(v\chi_A), \qquad v \in E. \]

This map is obviously a functional on E, and hence by the self duality of
Hilbert space there exists a unique vector ν(A) such that for all v ∈ E we
have

\[ \lambda(v\chi_A) = \langle v, \nu(A)\rangle. \]

The finite additivity of ν follows from the additivity of λ. Furthermore,
we have the estimate

\[ |\langle v, \nu(A)\rangle| = |\lambda(v\chi_A)| \le |v|\,t(A). \]

This yields

\[ |\nu(A)| \le t(A). \]

Let {A_n} be a decomposition of A, and let

\[ B_n = A_1 \cup \cdots \cup A_n. \]

Then ν(B_n) = ν(A_1) + ··· + ν(A_n), and

\[ |\nu(A) - \nu(B_n)| = |\nu(A - B_n)| \le t(A - B_n). \]

The right-hand side tends to 0 as n → ∞, so that ν is countably additive.

As for its total variation, let e_k be the unit vector in the direction of
ν(A_k), and consider the series

\[ g = \sum_{k=1}^{\infty} e_k \chi_{A_k}. \]

It is a measurable, bounded map. If n ≥ m, we have

\[ \left| \lambda\Bigl( \sum_{k=m}^{n} e_k \chi_{A_k} \Bigr) \right| \le t(A_{m,n}), \]

where A_{m,n} = A_m ∪ ··· ∪ A_n. Applying the Cauchy criterion, we have:

\[ \lambda(g) = \sum_{k=1}^{\infty} \lambda(e_k \chi_{A_k}) = \sum_{k=1}^{\infty} |\nu(A_k)| = |\lambda(g)| \]

and also by the hypothesis on λ,

\[ |\lambda(g)| \le t(A). \]

Taking A = X shows that the total variation is finite, whence ν is a
measure.

Finally, it is clear from the definition of ν that λ = dν on the step
maps. By Corollaries 4.2 and 4.3, if we write dν = f dμ for some f in
ℒ¹(μ, E) then we can extend dν to a μ-continuous functional on L^∞(μ, E).
This proves Corollary 4.4.


Example. Let X = [0, 1] with Lebesgue measure μ. Let F be the
space of continuous functions on X, so that F is a subspace of ℒ^∞(μ, C).
It is easy to verify that if f, g ∈ F are equivalent (i.e. equal almost
everywhere), then they are equal, so that F is a subspace of L^∞(μ, C).
Let ν be the measure which gives the set {0} measure 1, and gives a
subset of [0, 1] measure 0 if this subset does not contain 0. For any
f ∈ F we have

\[ \int_X f\,d\nu = f(0). \]

This measure ν is obviously not μ-continuous, but dν is continuous for
the L^∞(μ)-seminorm (actually a norm on F). We can extend the
functional dν on F to all of L^∞(μ, C) by the Hahn-Banach theorem to give
examples of functionals on L^∞(μ, C) which cannot be represented by
μ-continuous measures.


Remark. The part of the proof showing that ν is a measure does not
depend in an essential way on the assumption that E is a Hilbert space,
and goes through with very minor modifications in the arbitrary Banach
case. The definition of μ-continuity of a functional λ applies in this case,
and one can characterize such functionals as measures in the following
manner:


Assume that μ(X) is finite. Let E be a Banach space and E' its dual
space. There exists a unique norm-preserving linear map

\[ M^1(\mu, E') \to L^\infty(\mu, E)' \]

from the space of μ-continuous E'-valued measures into the dual space
of L^∞(μ, E), denoted by ν ↦ dν, whose image is the space of
μ-continuous functionals on L^∞(μ, E), and such that on step maps vχ_A
(v ∈ E and A measurable) we have

\[ \langle v\chi_A, d\nu\rangle = \langle v, \nu(A)\rangle. \]

The crucial part of the proof of the preceding statement, namely that a
μ-continuous functional can be written as dν, follows closely the Hilbert
case proof of Corollary 4.4. See Exercises 16 and 17.

There remains to determine when a given measure ν can be written in
the form μ_f, for some f ∈ ℒ¹(μ, E), when E is an arbitrary Banach space.
A complete answer is given in Rieffel's paper [Ri], as follows:


Rieffel's Theorem. Let μ be σ-finite and let E be a Banach space. Let
m be an E-valued measure, which is μ-continuous. Either one of the
following conditions is necessary and sufficient that m can be written in
the form μ_f for some f ∈ ℒ¹(μ, E):


R. Given A measurable and 0 < μ(A) < ∞, there exists B ⊂ A with
μ(B) > 0 such that the average set

\[ \mathrm{Av}_B(m) = \text{set of all } m(Y)/\mu(Y), \quad Y \subset B,\ \mu(Y) > 0 \]

is relatively compact.

R'. Given A measurable with 0 < μ(A) < ∞, there is some B ⊂ A and a
compact subset K of E not containing 0 such that μ(B) > 0 and
m(Y) is contained in the cone generated by K for all Y ⊂ B.


Note. The cone generated by K is the set of all positive finite linear
combinations of elements of K. Condition R' may be expressed by
saying that m has compact direction locally somewhere. Condition R' is
obviously satisfied in the finite dimensional case. A discussion of the
literature and applications will also be found in Rieffel's paper. For an
example when the measure m cannot be written as μ_f, even though it is
μ-continuous, cf. Exercise 21.

For more on the Radon-Nikodym derivative in euclidean space, and
the relation between differentiation and integration, I recommend Smith's
book [Smi].


VII, §5. THE L^p SPACES, 1 < p < ∞

We let (X, ℳ, μ) be a measured space.

In this section we give results analogous to those concerning L²,
replacing 2 by a real number p with 1 < p. We need some inequalities to
replace the Schwarz inequality. Throughout we let q be the positive
number (necessarily > 1) such that

\[ \frac{1}{p} + \frac{1}{q} = 1, \]

and call q the dual exponent of p.

We have the basic inequalities for real a, b ≥ 0:

\[ (*) \qquad a^{1/p}\,b^{1/q} \le \frac{a}{p} + \frac{b}{q}, \]
\[ (**) \qquad \left( \frac{a+b}{2} \right)^{p} \le (a^p + b^p). \]

There are several easy proofs for this. Either take the log of both sides
and use the convexity of the log, or proceed as follows. If t ≥ 1, then

\[ t^{1/p} \le \frac{t}{p} + \frac{1}{q}, \]

as one sees by differentiating both sides, evaluating at t = 1, and seeing
that the derivative on the right is bigger than the derivative on the left.
Suppose now that a/b ≥ 1; the inequality (*) drops out at once. The
other is proved similarly.
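
The passage from the one-variable inequality to (*) and (**) can be written out. The following is a sketch under the standing assumption 1/p + 1/q = 1 with 1 < p, for real a ≥ b > 0; the remaining cases follow by exchanging the roles of (a, p) and (b, q), or are trivial when one of a, b is 0.

```latex
% Both sides of t^{1/p} <= t/p + 1/q agree at t = 1, since 1/p + 1/q = 1.
% For t >= 1 the derivatives compare as
\[
\frac{d}{dt}\,t^{1/p} = \frac{1}{p}\,t^{\frac1p - 1} \le \frac{1}{p}
  = \frac{d}{dt}\Bigl( \frac{t}{p} + \frac{1}{q} \Bigr),
\]
% which gives the inequality for all t >= 1. Put t = a/b and multiply by b:
\[
b\Bigl(\frac{a}{b}\Bigr)^{1/p} = a^{1/p}\,b^{1 - 1/p} = a^{1/p}\,b^{1/q}
  \le \frac{a}{p} + \frac{b}{q} .
\]
% For (**): the average is at most the larger of the two numbers, so
\[
\Bigl(\frac{a+b}{2}\Bigr)^{p} \le \max(a,b)^{p} \le a^{p} + b^{p} .
\]
```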

We let ℒ^p(μ) be the set of maps f on X which are μ-measurable, and
such that |f|^p lies in ℒ¹.


Theorem 5.1. Let 1 < p < ∞. Then ℒ^p(μ) is a vector space. If we
define

\[ \|f\|_p = \left( \int_X |f|^p\,d\mu \right)^{1/p}, \]

then || ||_p is a seminorm on ℒ^p. If f ∈ ℒ^p(μ) and g ∈ ℒ^q(μ), then |f||g|
is in ℒ¹ and Hölder's inequality holds, namely

\[ \int_X |f||g|\,d\mu \le \|f\|_p\,\|g\|_q. \]


Proof. We see that ℒ^p is a vector space directly by applying the
inequality (**). If ||f||_p = 0 or ||g||_q = 0, then f or g is 0 almost
everywhere and the Hölder inequality is obvious. Suppose that ||f||_p ≠ 0 and
||g||_q ≠ 0. Let

\[ a = \frac{|f|^p}{\|f\|_p^p} \quad \text{and} \quad b = \frac{|g|^q}{\|g\|_q^q}. \]

Using inequality (*), we find that

\[ \frac{|f||g|}{\|f\|_p\,\|g\|_q}
   \le \frac{1}{p}\,\frac{|f|^p}{\|f\|_p^p} + \frac{1}{q}\,\frac{|g|^q}{\|g\|_q^q}. \]

First this shows that |f||g| is in ℒ¹ (corollary of the dominated
convergence theorem), and second it yields the last inequality stated in the
theorem, after we integrate over X. To show that || ||_p is a seminorm,
write

\[ |f+g|^p \le |f||f+g|^{p-1} + |g||f+g|^{p-1}. \]

Integrating and using Hölder's inequality yields the fact that || ||_p is a
seminorm, and concludes the proof of the theorem.
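
The last step, from Hölder's inequality to the triangle inequality for || ||_p, can be written out as follows. This is a sketch assuming f, g ∈ ℒ^p(μ), so that f + g ∈ ℒ^p(μ) by (**), and using (p − 1)q = p, which follows from 1/p + 1/q = 1.

```latex
% Integrate |f+g|^p <= |f||f+g|^{p-1} + |g||f+g|^{p-1} and apply Hoelder to each term:
\[
\|f+g\|_p^{\,p} = \int_X |f+g|^p\,d\mu
  \le \bigl( \|f\|_p + \|g\|_p \bigr)\,\bigl\|\,|f+g|^{p-1}\bigr\|_q .
\]
% Since (p-1)q = p, the last factor is a power of ||f+g||_p:
\[
\bigl\|\,|f+g|^{p-1}\bigr\|_q
  = \Bigl( \int_X |f+g|^{p}\,d\mu \Bigr)^{1/q} = \|f+g\|_p^{\,p/q} .
\]
% If ||f+g||_p is not 0, divide by ||f+g||_p^{p/q} and use p - p/q = 1:
\[
\|f+g\|_p \le \|f\|_p + \|g\|_p .
\]
```

If ||f + g||_p = 0 there is nothing to prove, and homogeneity of || ||_p is immediate from the definition.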


We are now in a position to prove that many of the results of §1 hold if
one replaces L² by L^p with 1 < p < ∞.


Theorem 5.2. Let {f_n} be an L^p-Cauchy sequence in ℒ^p. Then there
exists some f ∈ ℒ^p having the following properties:

(i) The sequence {f_n} is L^p-convergent to f, so that ℒ^p is complete.

There exists a subsequence having the following properties:

(ii) This subsequence of {f_n} converges almost everywhere to f.
(iii) Given ε, there exists a set Z with μ(Z) < ε such that the
convergence of this subsequence is uniform on the complement of Z.


Proof. Identical with that of Theorem 1.4.


Theorem 5.3 (Dominated Convergence Theorem for L^p). Let {f_n} be
a sequence in ℒ^p which converges pointwise almost everywhere to f.
Assume that there exists g ∈ ℒ^p(μ, R) such that g ≥ 0 and such that
|f_n| ≤ g. Then f is in ℒ^p and {f_n} is L^p-convergent to f.

Proof. As before.


Corollary 5.4. The step maps are dense in ℒ^p, and L^p(μ) is the
completion of the step maps in the L^p-seminorm.


Proof. As before.


Finally, the duality statement holds. Over C, we let ⟨f, g⟩ = fg.
Theorem 5.5 and its proof are true as usual for a Hilbert space E and
E-valued maps f, g, with ⟨f, g⟩ denoting the scalar product of the values
of f and g.


Theorem 5.5. Assume that μ is σ-finite. For f ∈ ℒ^p(μ) and g ∈ ℒ^q(μ),
we let

\[ \langle f, g\rangle_\mu = \int_X \langle f, g\rangle\,d\mu, \]

and define λ_g by λ_g(f) = ⟨f, g⟩_μ. Then the map g ↦ λ_g is a
norm-preserving isomorphism of L^q(μ) onto the dual space of L^p(μ).


Proof. We consider first as usual the case when μ(X) is finite. Our
map g ↦ λ_g is certainly an injective linear map, and we have

\[ |\lambda_g| \le \|g\|_q \]

by Hölder's inequality. Let us prove that it is surjective. Let λ be a
functional on L^p(μ). Then λ can be viewed as a μ-continuous, μ-bounded
functional on L^∞(μ) because if g is a bounded measurable map, then g is
in ℒ^p(μ) and if C = |λ|, then

\[ |\lambda(g)| \le C\|g\|_p = C\left( \int_X |g|^p\,d\mu \right)^{1/p}
   \le C\|g\|_\infty\,\mu(X)^{1/p}. \]

If we replace X by A for any measurable A, and g by g_A, we get
the same estimate with μ(X) replaced by μ(A). We can therefore apply
Corollary 4.4 of the Radon-Nikodym theorem (vectorial case). There
exists a map f ∈ ℒ¹(μ) such that λ = f dμ as a functional on L^∞(μ). We
shall prove that in fact, f lies in ℒ^q(μ). Let Y_n be the set of x such that
|f(x)| ≤ n. We first get a bound for the integral of |f|^q over Y_n. Let

\[ g = \frac{f}{|f|}\,|f|^{q-1} \]

and let g_n be equal to g on Y_n and 0 outside Y_n. (That is g_n = χ_{Y_n} g.) As
usual, dividing by |f| is to be understood as 1/|f(x)| if f(x) ≠ 0 and 0 if
f(x) = 0. Then g_n is bounded, |g|^p = |f|^q, and

\[ (1) \qquad \int_X \langle g_n, f\rangle\,d\mu = \int_{Y_n} |f|^q\,d\mu
   = \lambda(g_n) \le C\|g_n\|_p \le C\left( \int_{Y_n} |f|^q\,d\mu \right)^{1/p}. \]

From this we conclude that

\[ \left( \int_X \chi_{Y_n} |f|^q\,d\mu \right)^{1/q} \le C. \]

By the monotone convergence theorem, it follows that |f|^q lies in ℒ¹,
whence f lies in ℒ^q and ||f||_q ≤ |λ|.

The functionals $\lambda$ and $f\,d\mu$ have the same effect on step maps,
which are dense in $\mathcal{L}^p$. Therefore they are equal on
$\mathcal{L}^p(\mu)$. This proves our theorem when $\mu(X)$ is finite.

As for the $\sigma$-finite case, we consider a decomposition
$X = \bigcup X_k$ (disjoint union of sets of finite measure). For each $X_k$
we can find a function $f_k$ that lies in $\mathcal{L}^q(\mu)$ and is 0
outside $X_k$, and such that $f_k\,d\mu$ represents $\lambda$ over $X_k$. Let
$h \in \mathcal{L}^p(\mu)$ be arbitrary and let $h_k$ be the same map as $h$
on $X_k$ and 0 outside $X_k$. Then the series

$$\sum_{k=1}^{\infty} h_k$$

is $L^p$-convergent to $h$, say by the dominated convergence theorem, and
therefore by the continuity of $\lambda$ we have

$$\lambda h = \sum_{k=1}^{\infty} \lambda(h_k).$$

For each $k$ we have $\lambda h_k = \langle h_k, f_k\rangle_\mu$. If we let
$f = \sum f_k$, it follows that $\lambda = f\,d\mu$ on $\mathcal{L}^p(\mu)$.
This concludes the proof of the $L^p$-duality theorem.
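The norm-preserving part of Theorem 5.5 can be illustrated on a finite measured space. The following is only a sketch under stated assumptions: the space has three points with hypothetical weights `w`, the maps are real valued, and the names `weighted_norm`, `pairing` and `extremal` are ours, not the text's. Given $g$, it builds the test function $f = \operatorname{sign}(g)|g|^{q-1}/\|g\|_q^{q-1}$, which has $\|f\|_p = 1$ and satisfies $\langle f, g\rangle_\mu = \|g\|_q$, the same device as the map $g_n$ in the proof.

```python
import math

def weighted_norm(v, w, p):
    """L^p seminorm of v on a finite space whose points have measures w."""
    return sum(wi * abs(vi) ** p for vi, wi in zip(v, w)) ** (1.0 / p)

def pairing(f, g, w):
    """<f, g>_mu = integral of f*g d(mu) for real-valued f and g."""
    return sum(wi * fi * gi for fi, gi, wi in zip(f, g, w))

def extremal(g, w, q):
    """Unit vector of L^p (1/p + 1/q = 1) at which lambda_g attains ||g||_q."""
    scale = weighted_norm(g, w, q) ** (q - 1)
    return [math.copysign(abs(t) ** (q - 1), t) / scale for t in g]

# Hypothetical data: three points, q = 3, hence p = 3/2.
w = [0.5, 1.0, 2.0]
g = [1.0, -2.0, 3.0]
q = 3.0
p = q / (q - 1.0)
f = extremal(g, w, q)
```

Together with Hölder's inequality, this shows that $|\lambda_g| = \|g\|_q$ in the finite case.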


Remark. The proof follows the classical pattern (see Rudin [Ru 1] or
Loomis [Lo]), granted the $L^2$ and $(L^1, L^\infty)$-duality theorems. For
the general case when $E$ is a Banach space, and one wants $L^q(\mu, E')$ to
be dual to $L^p(\mu, E)$ for $1 < p < \infty$, cf. Dinculeanu [Din], §13,
Corollary 1 of Theorem 8, where this is proved under some countability
assumption.


The next theorem gives an example of an integral operator in the
fairly general setting of $L^p$-spaces.


Theorem 5.6. Let $1 < p < \infty$ and $C > 0$. Let $K$ be a measurable
function on $X \times X$ such that

$$\int_X |K(x, y)|\,d\mu(y) \le C \quad\text{for all } x \in X$$

and

$$\int_X |K(x, y)|\,d\mu(x) \le C \quad\text{for all } y \in X.$$

Let $f \in L^p(\mu)$. Then the function $S_K f$ defined by

$$S_K f(x) = \int_X K(x, y) f(y)\,d\mu(y)$$

is defined for almost all $x$, and is in $L^p(\mu)$. Furthermore,

$$\|S_K f\|_p \le C\|f\|_p.$$


Proof. We leave the proof as an exercise. The $L^2$-case is especially
interesting. Cf. Exercises 9-13 of Chapter XVII.
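A numerical sanity check of the estimate in Theorem 5.6 may help before attempting the proof. The sketch below assumes the simplest possible setting, which is ours and not the text's: $X = \{0, \dots, 7\}$ with counting measure, and a hypothetical kernel $K(x, y) = 2^{-|x - y|}$. The helper names are illustrative only.

```python
def lp_norm(v, p):
    """L^p norm of a finite sequence with respect to counting measure."""
    return sum(abs(t) ** p for t in v) ** (1.0 / p)

def apply_kernel(K, f):
    """(S_K f)(x) = sum over y of K[x][y] * f[y]."""
    return [sum(kxy * fy for kxy, fy in zip(row, f)) for row in K]

def schur_constant(K):
    """Smallest C bounding all absolute row sums and column sums of K."""
    row_max = max(sum(abs(t) for t in row) for row in K)
    col_max = max(sum(abs(row[j]) for row in K) for j in range(len(K[0])))
    return max(row_max, col_max)

def schur_ratio(K, f, p):
    """||S_K f||_p / (C ||f||_p); the theorem asserts this is at most 1."""
    return lp_norm(apply_kernel(K, f), p) / (schur_constant(K) * lp_norm(f, p))

# Hypothetical kernel and test function.
N = 8
K = [[2.0 ** (-abs(x - y)) for y in range(N)] for x in range(N)]
f = [float((-1) ** y * (y + 1)) for y in range(N)]
```

Evaluating `schur_ratio(K, f, p)` for several exponents `p` gives values not exceeding 1, as the theorem predicts; the check proves nothing, but it makes the roles of the two hypotheses on $K$ concrete.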


VII, §6. THE LAW OF LARGE NUMBERS


I cannot resist giving an application of integration theory to a
probabilistic setting which shows integration theory at work. This
consists of the "law of large numbers" in a suitable formulation. I follow
the exposition of [La-T]. This section can be read immediately after §1,
as an application of the definitions and convergence theorems in §1
concerning $L^2$.

We assume that the reader has done the exercise of extending the 
notion of product measures to denumerable products. Specifically, we 
use the following theorem. 


Let $(X_n, \mathcal{M}_n, \mu_n)$ be a sequence of measured spaces such that
$\mu_n(X_n) = 1$ for almost all $n$ (meaning for all but a finite number of
$n$). Let $\mathcal{M}$ be the $\sigma$-algebra in the product space

$$X = \prod X_n$$

generated by all sets

$$A = \prod A_n,$$

where $A_n \in \mathcal{M}_n$, and $A_n = X_n$ for almost all $n$. Then there
exists a unique measure $\mu$ on $(X, \mathcal{M})$ such that for all such
sets $A$ we have

$$\mu(A) = \prod \mu_n(A_n).$$

We call $\mu = \bigotimes \mu_n$ the product measure.


In the sequel we assume that in fact $\mu_n(X_n) = 1$ for all $n$. We view
$X$ as our probability space.


Theorem 6.1. Suppose given a measurable subset $S_n$ of $X_n$ for each $n$.
Assume that the limit exists,

$$\lim_{n \to \infty} \mu_n(S_n) = L.$$

Then for almost all elements (sequences) $x = \{x_n\}$ in $X$, the density
of $n$ such that $x_n \in S_n$ exists and is equal to $L$. This means:

$$\lim_{N \to \infty} \frac{\#\{n \le N,\ x_n \in S_n\}}{N} = L.$$
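The statement can be watched at work in a simulation. The sketch below rests on assumptions of our own choosing: each $X_n$ is the interval $[0, 1)$ with Lebesgue measure, and $S_n = [0, L + 1/(n+1))$ with $L = 0.3$, so that $\mu_n(S_n) \to L$. A pseudo-random generator with a fixed seed stands in for a "typical" point $x$ of the product space; the function name `hit_density` is hypothetical.

```python
import random

def hit_density(N, L=0.3, seed=0):
    """Fraction of n <= N with x_n in S_n, for one sampled sequence x."""
    rng = random.Random(seed)
    hits = 0
    for n in range(1, N + 1):
        p_n = L + 1.0 / (n + 1)      # mu_n(S_n), which tends to L
        if rng.random() < p_n:       # x_n falls in S_n = [0, p_n)
            hits += 1
    return hits / N
```

For large `N` the returned density should lie close to `L`. A single run only illustrates the theorem, which is a statement about almost all sequences.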


The above theorem has a simple intuitive content, but some applica- 
tions require a stronger version, as follows. 


Theorem 6.2. Suppose given a measurable subset $S_n$ of $X_n$ for each $n$.
Let $\{b_n\}$ be a sequence of positive real numbers tending monotonically
to infinity. Assume that

$$\sum \frac{1}{b_n^2}\,\mu_n(S_n) < \infty.$$

Then for almost all sequences $x$ we have

$$\#\{n \le N,\ x_n \in S_n\} = \sum_{n=1}^{N} \mu_n(S_n) + o(b_N),$$

where $o(b_N)/b_N \to 0$ as $N \to \infty$.


The first theorem is obtained from the second by putting $b_n = n$. We
shall now prove the theorem.

The first lemma, due to Kolmogoroff and formulated by him in
probabilistic terms, will be a refinement of the fundamental lemma of
integration theory, which asserts that given an $L^1$ (or $L^p$) Cauchy
sequence, there exists a subsequence that converges absolutely almost
everywhere. Here we give up on absolute convergence, but have conditions
which make the full sequence converge pointwise almost everywhere.




Lemma 6.3. For each $n$ let $h_n$ be a function on $X_n$, also viewed as a
function on $X$ by projection on the $n$-th factor. Assume that

$$\int_{X_n} h_n\,d\mu_n = 0.$$

Let

$$H_n(x) = \sum_{k=1}^{n} h_k(x_k)$$

be the partial sum. Assume that $\sum \|h_n\|_2^2$ converges. Then the limit

$$\lim_{n \to \infty} H_n(x)$$

exists for almost all $x \in X$.


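Proof. We first note that the functions $h_n$ are mutually orthogonal
on $X$. The heart of the proof lies in the next statement.


Kolmogoroff's Inequality. Given $\epsilon$, let

$$Z = \Bigl\{x \in X,\ \max_{1 \le k \le n} H_k(x)^2 \ge \epsilon\Bigr\}.$$

Then

$$\mu(Z) \le \frac{1}{\epsilon} \sum_{k=1}^{n} \|h_k\|_2^2.$$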
Proof. Let

$$Y_k = \{x \in X \text{ such that } H_k^2(x) \ge \epsilon \text{ and } H_i^2(x) < \epsilon \text{ for all } i < k\}.$$

In other words, $Y_k$ is the set of points $x$ such that $H_k^2(x)$ is the
first partial sum at least equal to $\epsilon$. Then the sets $Y_k$ are
disjoint, their union is $Z$, and we get the inequality

$$\epsilon \sum_{k=1}^{n} \mu(Y_k) \le \sum_{k=1}^{n} \int_{Y_k} H_k^2\,d\mu.$$

Write

$$H_k^2 = H_n^2 - 2H_k(H_n - H_k) - (H_n - H_k)^2.$$

The last term is negative, and we shall leave it out when we integrate.
On the other hand, the middle term gives

$$\int_{Y_k} H_k(H_n - H_k)\,d\mu = 0.$$

This holds because $H_k$, like the characteristic function of $Y_k$, is
effectively a function of only the first $k$ variables, whereas
$H_n - H_k$ is effectively a function of only the last $n - k$ variables.
The integral splits into a product of integrals over the distinct
variables, and is immediately seen to yield 0, as desired. Therefore we
can replace $H_k^2$ by $H_n^2$ and then integrate over all of $X$, thereby
giving as bound the square of the $L^2$-norm of $\sum h_k$, which proves
the asserted inequality.
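Kolmogoroff's inequality can be verified exactly in a small case. The sketch below makes assumptions that are ours alone: each $X_k$ is the two-point space $\{-1, +1\}$ with measure $1/2$ on each point, and $h_k(x_k) = a_k x_k$, so that $\int h_k\,d\mu_k = 0$ and $\|h_k\|_2^2 = a_k^2$. Since the product space is finite, $\mu(Z)$ is computed by enumerating all sign patterns; the name `kolmogorov_check` is hypothetical.

```python
from itertools import product

def kolmogorov_check(a, eps):
    """Return (mu(Z), bound) for h_k(x_k) = a[k] * x_k with x_k = +-1."""
    n = len(a)
    bad = 0
    for signs in product((-1.0, 1.0), repeat=n):
        partial, worst = 0.0, 0.0
        for s, ak in zip(signs, a):
            partial += s * ak                      # H_k(x)
            worst = max(worst, partial * partial)  # max over k of H_k(x)^2
        if worst >= eps:
            bad += 1
    measure = bad / 2 ** n                         # each pattern has measure 2^-n
    bound = sum(ak * ak for ak in a) / eps         # (1/eps) * sum ||h_k||_2^2
    return measure, bound

# Hypothetical coefficients a_k = 1/k for k = 1, ..., 10 and eps = 4.
measure, bound = kolmogorov_check([1.0 / k for k in range(1, 11)], 4.0)
```

The first returned value never exceeds the second, whatever coefficients are used.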


We have assumed that $\sum h_k$ is in $L^2$, that is

$$\sum_{k=1}^{\infty} \|h_k\|_2^2 < \infty.$$

This means that for $m_0$ sufficiently large, and $n \ge m \ge m_0$, we get

$$\mu\Bigl\{x \in X,\ \max_{m \le k \le n} (H_k - H_m)^2(x) \ge \epsilon\Bigr\} \le \frac{1}{\epsilon} \sum_{k=m+1}^{n} \|h_k\|_2^2 < \epsilon.$$

Define

$$Z_i = \{x \in X,\ (H_n - H_m)^2(x) \ge 1/2^i \text{ if } m, n \ge m_0(i)\}.$$

Then $Z_i$ has measure $\le 1/2^i$ if we pick $m_0(i)$ sufficiently large. Let

$$W_n = Z_n \cup Z_{n+1} \cup \cdots$$

for large $n$, so that $W_n$ has measure $\le 1/2^{n-1}$. Then the partial
sums $\sum h_k(x)$ converge for $x$ not in $W_n$. Hence if we let $W$ be the
intersection

$$W = \bigcap W_n,$$

then these partial sums converge for $x$ not in $W$, and $W$ has measure
zero, thereby proving the lemma.


The next theorem is also due to Kolmogoroff, in that generality. 


Theorem 6.4. For each $n$ let $f_n$ be a function on $X_n$, and assume that

$$\int_{X_n} f_n\,d\mu_n = 0.$$

Let $\{b_n\}$ be a sequence of positive real numbers monotonically
increasing to infinity. If

$$\sum \frac{1}{b_n^2}\,\|f_n\|_2^2 < \infty,$$

then for almost all $x$ the partial sums

$$F_n(x) = \sum_{k=1}^{n} f_k(x_k)$$

satisfy the estimate

$$F_n(x) = o(b_n).$$


Proof. Let $h_n = f_n/b_n$ and apply the lemma to the partial sums

$$H_n(x) = \sum_{k=1}^{n} h_k(x_k) = \sum_{k=1}^{n} \frac{f_k(x_k)}{b_k}.$$

The lemma says that these partial sums converge for almost all $x$. It is a
trivial fact (proved by summation by parts) that if $\sum a_k$ is a
convergent series, then

$$\sum_{k=1}^{n} a_k b_k = o(b_n).$$

Applying this fact when $a_k = h_k(x_k)$ proves the theorem.
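The "trivial fact" used above (often called Kronecker's lemma) is easy to observe numerically. In the sketch below, the series $a_k = k^{-3/2}$ and the weights $b_k = k$ are hypothetical choices of ours; the function returns the quotient $\bigl(\sum_{k \le n} a_k b_k\bigr)/b_n$, which should tend to 0.

```python
def kronecker_ratio(a, b, n):
    """(sum_{k=1}^{n} a(k) * b(k)) / b(n) for functions a, b of the index k."""
    total = sum(a(k) * b(k) for k in range(1, n + 1))
    return total / b(n)

# Hypothetical data: a convergent series and weights increasing to infinity.
a = lambda k: k ** -1.5
b = lambda k: float(k)
```

Here the numerator grows like $2\sqrt{n}$, so the quotient behaves like $2/\sqrt{n}$; comparing `kronecker_ratio(a, b, 100)` with `kronecker_ratio(a, b, 10000)` shows the decay.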
We have stated Theorem 6.4 under the normalization that the integral
of the functions $f_n$ is 0. This is of course not satisfied in general,
but a translation reduces the general case to this special case. Indeed,
suppose that $\psi_n$ are functions such that

$$\int_{X_n} \psi_n\,d\mu_n = c_n$$

is a constant $c_n$. Define

$$f_n = \psi_n - c_n.$$

Then the integral of $f_n$ is 0. In particular, suppose that $\psi_n$ is the
characteristic function of some subset $S_n$ of $X_n$. Then

$$\|f_n\|_2^2 = \int_{X_n} (\psi_n - c_n)^2\,d\mu_n = c_n - c_n^2.$$

Applying Theorem 6.4 in this situation yields Theorem 6.2, as desired.


VII, §7. EXERCISES


Unless otherwise specified, $(X, \mathcal{M}, \mu)$ is a measured space.


1. Let $\mathcal{A}$ be an algebra in $X$ and $\mu$ a positive measure on
$\mathcal{A}$. Assume that all elements of $\mathcal{A}$ have finite
measure. For $A, B \in \mathcal{A}$ define

$$d(A, B) = \mu(A - B) + \mu(B - A).$$

Show that $d$ is a semimetric [in the obvious sense, that is
$d(A, B) \ge 0$, $d(A, B) = d(B, A)$, and the triangle inequality is
satisfied]. The only difference from a metric is that we may have
$d(A, B) = 0$ and yet $A \ne B$. In this way, $\mathcal{A}$ becomes a
topological space, and $\mu$-continuity corresponds to the topological
notion.


2. Radon-Nikodym derivative. Let $\mu$, $\nu$ be positive measures, and let
$m$ be a complex measure. Suppose that $dm = f\,d\mu$, where
$f \in \mathcal{L}^1(\mu)$ and $d\mu = g\,d\nu$ where
$g \in \mathcal{L}^1(\nu, \mathbf{R})$. Prove that
$fg \in \mathcal{L}^1(\nu)$, and that $dm = fg\,d\nu$. If we use the
notation $dm/d\mu = f$ and $d\mu/d\nu = g$, then we have the old formalism

$$\frac{dm}{d\nu} = \frac{dm}{d\mu}\,\frac{d\mu}{d\nu}.$$

One sometimes calls $f$ the Radon-Nikodym derivative of $m$ with respect to
$\mu$. [By the way, you may view $dm$ or $d\mu$, $d\nu$ as linear maps on
step functions, which amounts to considering measures $m = \mu_f$ or
$\mu = \nu_g$.]


3. Let $X$ consist of two points $x$ and $y$. Define $\mu(\{x\}) = 1$ and
$\mu(\{y\}) = \mu(X) = \infty$.


Determine whether $L^\infty(\mu, \mathbf{R})$ is the dual of
$L^1(\mu, \mathbf{R})$.


4. Let $F$ be a subspace of $L^2(\mu, \mathbf{C})$ and assume that there is
some number $c > 0$ such that for all $f$ in $F$ we have

$$\|f\|_\infty \le c\|f\|_2.$$

Assume that $\mu(X) < \infty$. Show that $F$ is finite dimensional and that

$$\dim F \le c^2 \mu(X).$$

[Hint (Moser): Let $f_1, \dots, f_n$ be orthonormal elements in $F$. Let

$$b^2 = \operatorname{ess\,sup}(|f_1|^2 + \cdots + |f_n|^2).$$

Let $x_0$ be a point such that

$$\sum |f_i(x_0)|^2 \ge b^2(1 - \epsilon) \quad\text{and also}\quad \ge b^2 - \epsilon b.$$

Consider the function $f = \sum \alpha_i f_i$ with
$\alpha_i = f_i(x_0)/b$.]


5. Let $X$ be the set of positive integers, and let $\mu$ be the counting
measure on $X$ which gives each point measure 1. Let $l^1$ and $l^\infty$
denote $L^1(\mu, \mathbf{C})$ and $L^\infty(\mu, \mathbf{C})$.

(a) Show that $l^1$ consists of all complex sequences $\alpha = \{a_n\}$
with norm

$$\|\alpha\|_1 = \sum |a_n| < \infty.$$

Show that $l^\infty$ consists of all complex sequences $\alpha = \{a_n\}$
with norm

$$\|\alpha\|_\infty = \sup |a_n| < \infty.$$

(b) Show that $l^1$ is the dual of the subspace $C_0$ of $l^\infty$
consisting of all sequences $\alpha = \{a_n\}$ such that $a_n \to 0$ as
$n \to \infty$. Show that the dual of $l^1$ is $l^\infty$, quoting any
theorem from the text.

(c) Show that $C_0$ and $l^1$ are separable, but that $l^\infty$ is not
separable.


The space $H_s$. (For applications to PDE, cf. $SL_2(\mathbf{R})$,
Appendix 4.)


6. Let $s$ be an integer. On the integers $\mathbf{Z}$ define

$$\mu_s(n) = (1 + n^2)^s.$$

Then $\mu_s$ is a measure on $\mathbf{Z}$.
(a) Define the space $H_s$ to be the space of functions on $\mathbf{Z}$,
written in the form of sequences $\{a_n\}$, such that the sum

$$\sum (1 + n^2)^s |a_n|^2$$

converges. If $f = \{a_n\}$ and $g = \{b_n\}$, define the scalar product in
$H_s$ to be

$$\langle f, g\rangle = \sum a_n \bar{b}_n (1 + n^2)^s.$$

Show that $H_s = L^2(\mathbf{Z}, \mu_s)$, and in particular is complete for
the norm associated with this scalar product.

(b) Show that the finite sequences $f = \{a_n\}$ such that $a_n = 0$ for all
but a finite number of $n$ form a dense subspace of $H_s$.


7. For each function $f \in C^\infty(\mathbf{T})$, where
$\mathbf{T} = \mathbf{R}/\mathbf{Z}$ is the circle, or if you wish, for
each $C^\infty$ function on $\mathbf{R}$, periodic of period 1, associate
the Fourier series

$$f(x) = \sum a_n e^{2\pi i n x} \quad\text{where}\quad a_n = \int_0^1 f(x) e^{-2\pi i n x}\,dx.$$

(a) Integrating by parts, show that the coefficients satisfy the inequality

$$|a_n| \ll \frac{1}{|n|^k}$$

for each positive integer $k$. The symbol $\ll$ means that the left-hand
side is less than some constant times the right-hand side for
$|n| \to \infty$.

(b) Prove that $C^\infty(\mathbf{T}) \subset L^2(\mathbf{Z}, \mu_s)$ for all
$s \in \mathbf{Z}$, and that $C^\infty(\mathbf{T})$ is dense in this space
$L^2$. [Look at the finite Fourier series

$$\sum_{|n| \le N} a_n e^{2\pi i n x}.]$$


8. Let $r < s$. Prove that the unit ball in $H_s$ is relatively compact in
$H_r$, in other words that this unit ball is totally bounded in $H_r$.

9. Let $\{f_n\}$ be a sequence in $\mathcal{L}^2(X, \mu)$ such that
$\|f_n\|_2 \to 0$ as $n \to \infty$. Prove that

$$\lim_{n \to \infty} \int_X |f_n(x)| \log(1 + |f_n(x)|)\,d\mu(x) = 0.$$


10. Let $1 \le p < \infty$ and let $f \in \mathcal{L}^p(\mathbf{R})$ (for
Lebesgue measure). For $a \in \mathbf{R}$ let $T_a$ be the translation by
$a$, that is $T_a f(x) = f(x - a)$. Prove that $T_a f$ converges to $f$ in
$L^p$ as $a \to 0$. Is the conclusion still true if $p = \infty$?


11. Let $\mu$ be a $\sigma$-finite positive measure on the Borel sets in
$\mathbf{R}$, and suppose
$L^1(\mathbf{R}, \mu) \subset L^\infty(\mathbf{R}, \mu)$. Show that there
exists $c > 0$ such that if $A$ is a Borel set with $\mu(A) > 0$ then
$\mu(A) \ge c$.


12. Prove Theorem 5.6. [Hint: Use Hölder's inequality and Fubini's theorem.]


13. Let $T: L^2(X, \mu) \to L^2(X, \mu)$ be a continuous linear map, and
assume that $X$ is $\sigma$-finite. Assume that $T$ commutes with all
operators $M_g$ such that $M_g(f) = gf$, for $g \in \mathcal{L}^\infty$ and
$f \in \mathcal{L}^2$. Prove that $T = M_g$ for some $g$. [Hint: Write $X$
as a disjoint union of sets of finite measure $X_n$ and let $\varphi$ be the
function which is the constant $1/(n\,\mu(X_n)^{1/2})$ on $X_n$. For
$f \in \mathcal{L}^2 \cap \mathcal{L}^\infty$, we have

$$T(\varphi f) = \varphi T(f) = f T(\varphi).$$

Let $g = T\varphi/\varphi$. Then $Tf = gf$. Prove that $g$ is bounded as
follows. If it is not, given $N$ there is a subset of finite positive
measure $Y$ such that $|g| \ge N$ on $Y$. Consider
$T((\bar g/g)\chi_Y)$ to contradict the boundedness of $T$.]

For an application, see $SL_2(\mathbf{R})$, Lemma 4 of Theorem 4,
Chapter XI, §3.


14. Let $E$ be a Banach space and let $\nu: \mathcal{M} \to E$ be an
$E$-valued measure. Show that one can define (in a manner similar to that
in the text) a linear map

$$\mathrm{St}(|\nu|, \mathbf{C}) \to E,$$

and that this map is $L^1(|\nu|)$-continuous. This linear map can therefore
be extended linearly by continuity to
$\mathcal{L}^1(|\nu|, \mathbf{C})$, thus allowing you to define
$\int f\,d\nu$, for $f \in \mathcal{L}^1(|\nu|, \mathbf{C})$.


15. Let $E$ be a Banach space and $E'$ its dual. In the bilinear map

$$L^1(\mu, E) \times L^\infty(\mu, E') \to \mathbf{C}$$

given by

$$(f, g) \mapsto \langle f, g\rangle_\mu = \int_X \langle f, g\rangle\,d\mu$$

show that $|\lambda_f| = \|f\|_1$ and $|\lambda_g| = \|g\|_\infty$, just as in
the Hilbert case. [Hint: Use step maps, and for a constant map, use the
Hahn-Banach theorem to see that given $v \in E$, there exists $v' \in E'$
such that $|v'| = |v|$ and $\langle v, v'\rangle = |v|^2$.]


16. Assume that $\mu(X)$ is finite. Let $E$ be a Banach space and $E'$ its
dual space. The definition of a $\mu$-continuous functional $\lambda$ on
$L^\infty(\mu, E)$ is as in the text. Show that such a functional can be
written in the form $\lambda = d\nu$ for some $E'$-valued measure $\nu$, in
the sense that on a map $v\chi_A$ ($v \in E$ and $A$ measurable) we have

$$\lambda(v\chi_A) = \langle v, \nu(A)\rangle.$$


17. Prove the statement included in the remark following Corollary 4.4 of
Theorem 4.1, concerning that part of the dual of $L^1(\mu, E)$ represented
by a measure in $E'$.


18. Let $f$ be a $\mu$-measurable map of $X$ into a Banach space $E$. Given
a measurable set $A$ with $\mu(A)$ finite, and $\epsilon$, show that there
exists $Z \subset A$ such that $\mu(Z) < \epsilon$ and $f(A - Z)$ is
relatively compact (or equivalently, totally bounded). We may say that $f$
is locally almost compact valued.


19. The essential image. Let $E$ be a Banach space. Let $f$ be a measurable
map and let $A$ be a measurable set. The essential image of $f$ on $A$ is
defined to be the set of all $v \in E$ such that for every $r > 0$ the
measure of the set

$$A \cap f^{-1}(B_r(v))$$

is strictly positive. We denote it by $\mathrm{ei}_A(f)$.
(i) The essential image is closed.
(ii) If $\mu(A) > 0$, then $\mathrm{ei}_A(f)$ intersects the image $f(A)$.
(iii) The set $Z$ of elements $x \in A$ such that $f(x)$ does not lie in
$\mathrm{ei}_A(f)$ has measure 0.
(iv) Let $A = \bigcup A_n$ be a denumerable union of measurable sets. Show
that

$$\mathrm{ei}_A(f) = \text{closure of } \bigcup_{n=1}^{\infty} \mathrm{ei}_{A_n}(f).$$


20. Let $E$ be a real Banach space, and $f: X \to E$ any map. Let
$g: X \to \mathbf{R}$ be a real positive function on $X$ which is in
$\mathcal{L}^1(\mu, \mathbf{R})$ and such that

$$\int_X g\,d\mu > 0.$$

Assume that $gf$ is in $\mathcal{L}^1(\mu, E)$. If $\lambda$ is a functional
on $E$, and $c \in \mathbf{R}$, we define a half space $H^+(\lambda, c)$ to
consist of all $v \in E$ such that $\lambda v \ge c$. Let $H$ be such a half
space containing $f(X)$. Show that

$$\frac{\int_X gf\,d\mu}{\int_X g\,d\mu}$$

belongs to $H$.

In view of the result on convex sets in §2 of the Appendix to Chapter 4, it
follows that the above "average" in fact lies in the closure of the convex
set generated by the image $f(X)$, i.e. the smallest closed convex set
containing $f(X)$.


21. Let $E = L^1(\mu, \mathbf{C})$ where $X = [0, 1]$ and $\mu$ is Lebesgue
measure on the algebra of Borel sets. For each Borel set $A$ let

$$m(A) = \text{class of } \chi_A \text{ in } L^1(\mu, \mathbf{C}).$$

(a) Show that $|m|$ is Lebesgue measure itself. (b) Show that $m$ is an
$E$-valued measure which cannot be written as $\mu_f$. [Hint: View $dm$ as
a functional on step functions, say real valued, so that for any step
function $\varphi$ and measurable set $A$ we have

$$\int_A \varphi\,dm = \int_A \varphi f\,d\mu.]$$


22. Let $E$ be a Banach space. Let $P$ denote the set of all partitions,
i.e. collections $\pi$ consisting of a finite number of disjoint measurable
sets of finite measure. We let $\pi_1 \ge \pi$ if every element of $\pi$ is,
up to a set of measure 0, the union of elements of $\pi_1$. For each
$\pi \in P$ and $f \in \mathcal{L}^1(\mu, E)$ we define

$$f_\pi = T_\pi f = \sum_{A \in \pi} [\mu_f(A)/\mu(A)]\,\chi_A$$

where $\mu_f(A) = \int_A f\,d\mu$.

(a) Show that $T_\pi: L^1(\mu, E) \to L^1(\mu, E)$ is a continuous linear
map of norm 1, and that $T_\pi f$ is $L^1$-convergent to $f$ in the
following sense: Given $\epsilon$ there exists $\pi_0$ such that for all
$\pi \ge \pi_0$ we have

$$\|T_\pi f - f\|_1 < \epsilon.$$

(b) Prove the same thing replacing 1 by $p$ for $1 < p < \infty$.


CHAPTER VIII 


Some Applications of
Integration


After the abstract theory on arbitrary measured spaces, it is a relief to
get into some classical situations on $\mathbf{R}^n$ where we see the
integral at work. None of this chapter will be used later, except for the
approximation by Dirac families in the uniqueness proof for the spectral
measure of Chapter 20.

In this chapter we deal with the Fourier transform in a context of
absolute convergence. In Chapter 10, §2 we shall deal with a more delicate
context, involving oscillatory convergence.


VIII, §1. CONVOLUTION


Suppose first we deal with functions $f$, $g$ on the real line. We shall
study their convolution, defined by the integral

$$f * g(y) = \int_{\mathbf{R}} f(x) g(y - x)\,dx.$$

Of course, the integral must be convergent, or even absolutely convergent.
Theorems 1.1 and 1.2 will give conditions for such convergence.
Furthermore, we don't need to work only on $\mathbf{R}$, and we shall
express the results on $\mathbf{R}^n$, abbreviating

$$\int_{\mathbf{R}^n} f(x) g(y - x)\,dx = \int f(x) g(y - x)\,dx.$$

In most applications, one of the two functions $f$ or $g$ is continuous or
even $C^\infty$, and the resulting convolution is also continuous or
$C^\infty$. To see this one must be able to take a limit or differentiate
under the integral sign, and the next section gives basic conditions under
which this is legitimate. We shall see several examples after the main
approximation theorem is proved in Theorem 3.1.

We now come to the basic tests for absolute convergence of the 
convolution integral. 


Theorem 1.1. Let $f, g \in \mathcal{L}^1(\mathbf{R}^n)$. Then for almost all
$y \in \mathbf{R}^n$ the function

$$x \mapsto f(x) g(y - x)$$

is in $\mathcal{L}^1(\mathbf{R}^n)$. The convolution $f * g$ given for
almost all $y$ by

$$f * g(y) = \int f(x) g(y - x)\,dx$$

is also in $L^1$. The association $(f, g) \mapsto f * g$ is an associative,
commutative bilinear map, satisfying

$$\|f * g\|_1 \le \|f\|_1 \|g\|_1.$$

Thus $L^1(\mathbf{R}^n)$ is a Banach algebra under the convolution product.


Proof. We integrate $|f(x)||g(y - x)|$ first with respect to $y$, and then
with respect to $x$. We apply part 2 of Fubini's theorem, Theorem 8.7 of
Chapter VI. We then conclude that $f * g$ is in $\mathcal{L}^1$. The last
inequality in the statement of the theorem follows at once. The bilinearity
is obvious, and so is commutativity. The associativity is proved using
Fubini's theorem, and is left to the reader.


Theorem 1.2. Let $f \in \mathcal{L}^1(\mathbf{R}^n)$ and
$g \in \mathcal{L}^p(\mathbf{R}^n)$ with $1 \le p \le \infty$. Then
$f * g(y)$ is defined by the integral for almost all $y$ and is in
$\mathcal{L}^p$. We have

$$\|f * g\|_p \le \|f\|_1 \|g\|_p.$$

Proof. The case $p = 1$ is treated in Theorem 1.1. Suppose that
$p = \infty$. If $f \in \mathcal{L}^1(\mathbf{R}^n)$ and
$g \in \mathcal{L}^\infty(\mathbf{R}^n)$, so we may assume $g$ is a bounded
measurable function, then we may also form the convolution $f * g$ given by
the same formula

$$f * g(y) = \int f(x) g(y - x)\,dx = \int f(y - x) g(x)\,dx.$$

The integrals converge absolutely, and we have the trivial estimate from
the first integral, replacing $g$ by its bound $\|g\|_\infty$, namely

$$\|f * g\|_\infty \le \|f\|_1 \|g\|_\infty,$$

as desired.
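Finally suppose that $1 < p < \infty$. Let $q$ be as usual such that
$1/p + 1/q = 1$. Then by Hölder's inequality we have

$$\int |f(x)||g(y - x)|\,dx = \int |f(x)|^{1/p} |g(y - x)|\,|f(x)|^{1/q}\,dx
\le \left[\int |f(x)||g(y - x)|^p\,dx\right]^{1/p} \left[\int |f(x)|\,dx\right]^{1/q},$$

from which we see that $f * g$ is defined for almost all elements of
$\mathbf{R}^n$, and also that

$$|(f * g)(y)|^p \le \left[\int |f(x)||g(y - x)|^p\,dx\right] \|f\|_1^{p/q}.$$

We integrate with respect to $y$ and use Fubini's theorem, obtaining

$$\|f * g\|_p^p \le \int |f(x)|\,\|g_x\|_p^p\,dx\ \|f\|_1^{p/q},$$

where $g_x(y) = g(y - x)$. But the $L^p$-seminorm is invariant under
translations, i.e. $\|g_x\|_p = \|g\|_p$. Since $1 + p/q = p$, we take the
$p$-th root to obtain

$$\|f * g\|_p \le \|g\|_p \|f\|_1,$$

thus proving our theorem.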
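The inequality of Theorem 1.2 has an exact analogue for finitely supported sequences on $\mathbf{Z}$ with counting measure, and that analogue is easy to test. The sketch below works in this discrete setting, which is an assumption of ours and not the setting of the theorem; the data `f`, `g` and the helper names are hypothetical.

```python
def convolve(f, g):
    """Discrete convolution (f*g)(m) = sum over i of f(i) * g(m - i)."""
    out = [0.0] * (len(f) + len(g) - 1)
    for i, fi in enumerate(f):
        for j, gj in enumerate(g):
            out[i + j] += fi * gj
    return out

def norm(v, p):
    """l^p norm of a finite sequence; p may be float('inf')."""
    if p == float("inf"):
        return max(abs(t) for t in v)
    return sum(abs(t) ** p for t in v) ** (1.0 / p)

# Hypothetical finitely supported sequences.
f = [1.0, -2.0, 0.5]
g = [3.0, 0.0, -1.0, 2.0]
```

For each exponent `p`, including `float('inf')`, one finds `norm(convolve(f, g), p) <= norm(f, 1) * norm(g, p)`.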


VIII, §2. CONTINUITY AND DIFFERENTIATION
UNDER THE INTEGRAL SIGN


Lemma 2.1. Let $X$ be a measured space with positive measure $\mu$. Let
$U$ be an open subset of $\mathbf{R}^n$. Let $f$ be a function on
$X \times U$. Assume:


(i) For each $y \in U$ the function $x \mapsto f(x, y)$ is in
$\mathcal{L}^1(\mu)$.
(ii) For each $x \in X$ and $y_0 \in U$, we have

$$\lim_{y \to y_0} f(x, y) = f(x, y_0).$$

(iii) There exists a function $f_1 \in \mathcal{L}^1(\mu)$ such that for all
$y \in U$,

$$|f(x, y)| \le |f_1(x)|.$$

Then the function

$$y \mapsto \int_X f(x, y)\,d\mu(x)$$

is continuous.


Proof. It suffices to prove that for any sequence $\{y_k\}$ converging to
$y$,

$$\int_X f(x, y_k)\,d\mu(x) \quad\text{converges to}\quad \int_X f(x, y)\,d\mu(x).$$

Let $f_k(x) = f(x, y_k)$. Then $\{f_k\}$ converges pointwise to the function

$$x \mapsto f(x, y),$$

and by (iii), we can apply the dominated convergence theorem to conclude
the proof.


Lemma 2.2. Let $X$ be a measured space with positive measure $\mu$. Let
$U$ be an open subset of $\mathbf{R}^n$. Let $f$ be a function on
$X \times U$. Assume:


(i) For each $y \in U$ the function $x \mapsto f(x, y)$ is in
$\mathcal{L}^1(\mu)$.
(ii) For each $y \in U$, each partial derivative $D_j f(x, y)$ (taken with
respect to the $j$-th $y$-variable) is in $\mathcal{L}^1(\mu)$.
(iii) There exists a function $f_1 \in \mathcal{L}^1(\mu)$ such that for all
$y \in U$,

$$|D_j f(x, y)| \le |f_1(x)|.$$

Let

$$\Phi(y) = \int_X f(x, y)\,d\mu(x).$$

Then $D_j\Phi$ exists and we have

$$D_j\Phi(y) = \int_X D_j f(x, y)\,d\mu(x).$$


Proof. Let $e_j$ be the usual $j$-th unit vector in $\mathbf{R}^n$. We have

$$\frac{\Phi(y + he_j) - \Phi(y)}{h} = \int_X \frac{1}{h}\bigl[f(x, y + he_j) - f(x, y)\bigr]\,d\mu(x).$$

Using the mean value theorem and (iii), together with the dominated
convergence theorem, we conclude that the right-hand side has a limit,
equal to

$$\int_X D_j f(x, y)\,d\mu(x).$$

[As in the previous proof, we have to use the device of taking a sequence
$\{h_k\}$ to apply the dominated convergence theorem in its standard form.]
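As a concrete instance of Lemma 2.2, take $X = [0, 1]$ with Lebesgue measure and the hypothetical integrand $f(x, y) = \cos(xy)$, for which $|D f(x, y)| = |x \sin(xy)| \le 1$. The sketch below compares the integral of the derivative with a difference quotient of $\Phi$; the integrals are approximated by a midpoint rule, and all names are ours.

```python
import math

def midpoint(fn, a, b, steps=2000):
    """Midpoint-rule approximation of the integral of fn over [a, b]."""
    h = (b - a) / steps
    return h * sum(fn(a + (i + 0.5) * h) for i in range(steps))

def phi(y):
    """Phi(y) = integral over [0, 1] of cos(x*y) dx (approximately)."""
    return midpoint(lambda x: math.cos(x * y), 0.0, 1.0)

def dphi_under_integral(y):
    """Integral over [0, 1] of the y-derivative -x*sin(x*y) of the integrand."""
    return midpoint(lambda x: -x * math.sin(x * y), 0.0, 1.0)

def dphi_difference(y, h=1e-4):
    """Central difference quotient of Phi at y."""
    return (phi(y + h) - phi(y - h)) / (2.0 * h)
```

Both derivative computations agree closely with the closed form $(y\cos y - \sin y)/y^2$ obtained from $\Phi(y) = (\sin y)/y$.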


Theorem 2.3. Let $f \in \mathcal{L}^1(\mathbf{R}^n)$ and let $\varphi$ be a
$C^\infty$ function with compact support. Then $f * \varphi$ is $C^\infty$
and in fact

$$D^p(f * \varphi) = f * D^p\varphi.$$


Proof. We can form the convolutions by using Theorem 1.1 and we
have

$$f * \varphi(y) = \int f(x)\varphi(y - x)\,dx.$$

Lemmas 2.1 and 2.2 show that $f * \varphi$ is $C^\infty$, and allow us to
differentiate repeatedly under the integral sign.


VIII, §3. DIRAC SEQUENCES


By a Dirac sequence on $\mathbf{R}^n$ we shall mean a sequence of functions
$\{\varphi_k\}$ on $\mathbf{R}^n$, real valued, continuous, satisfying the
following properties:


DIR 1. We have $\varphi_k \ge 0$ for all $k$.
DIR 2. For all $k$ we have $\int_{\mathbf{R}^n} \varphi_k(x)\,dx = 1$.


DIR 3. Given $\epsilon, \delta > 0$ there exists $k_0$ such that

$$\int_{|x| \ge \delta} \varphi_k(x)\,dx < \epsilon$$

for all $k \ge k_0$.


The third condition shows that for large $k$, the volume under $\varphi_k$
is concentrated near the origin. Thus in one variable, the sequence looks
like this:

We have drawn the picture so that the sequence satisfies a property
somewhat stronger than what is expressed in DIR 3, namely the support
of $\varphi_k$ tends to 0 as $k \to \infty$. We state this condition
formally. By a Dirac sequence with shrinking support, we mean a sequence
satisfying DIR 1, DIR 2, and the third condition:


DIR 3s. Each $\varphi_k$ has compact support, and given $\delta$, the
support of $\varphi_k$ is contained in the ball of radius $\delta$,
centered at the origin, for all $k$ sufficiently large.


To construct such a sequence we can start with a positive function
$\varphi$, continuous or even infinitely differentiable, having support in
the ball of radius 1, centered at the origin, and such that

$$\int_{\mathbf{R}^n} \varphi(x)\,dx = 1.$$

We then let $\varphi_k(x) = k^n\varphi(kx)$. A sequence constructed in this
manner will be called a regularizing sequence. It has additional
properties besides those three of the Dirac sequence, namely: the support
of $\varphi_k$ is determined explicitly in terms of the support of
$\varphi$, and is contained in the ball of radius $1/k$; in fact, it is
contained in $(1/k)\,\mathrm{supp}\,\varphi$. In addition, the partial
derivatives of $\varphi_k$ can be easily estimated in terms of those of
$\varphi$ and $k$. This is frequently useful in applications when one has
to make careful estimates on such derivatives.
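The construction of a regularizing sequence is short enough to carry out numerically in dimension $n = 1$. The sketch below assumes a particular starting function, the standard bump $\exp(-1/(1 - x^2))$ on $(-1, 1)$, normalized by a midpoint-rule integral; this choice and the names `bump`, `phi`, `phi_k` are ours.

```python
import math

def bump(x):
    """Unnormalized C-infinity function with support in [-1, 1]."""
    return math.exp(-1.0 / (1.0 - x * x)) if abs(x) < 1.0 else 0.0

def integrate(fn, a, b, steps=20000):
    """Midpoint-rule approximation of the integral of fn over [a, b]."""
    h = (b - a) / steps
    return h * sum(fn(a + (i + 0.5) * h) for i in range(steps))

MASS = integrate(bump, -1.0, 1.0)    # normalizing constant

def phi(x):
    """Positive function with integral 1 and support in [-1, 1]."""
    return bump(x) / MASS

def phi_k(k, x):
    """Regularizing sequence for n = 1: phi_k(x) = k * phi(k * x)."""
    return k * phi(k * x)
```

One checks that `phi_k(k, x)` vanishes for `abs(x) >= 1/k` and that its integral stays equal to 1 up to quadrature error, so DIR 1, DIR 2 and DIR 3s hold.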

We now show how a Dirac sequence can be used to approximate a
function. We shall prove first the main approximation theorem for
$\mathcal{L}^\infty$, with condition DIR 3, and give some applications.
Then we prove an analogous theorem with condition DIR 3s, and see how it
implies approximation results for functions in $\mathcal{L}^p$ with
$p < \infty$.


Theorem 3.1. Let $f$ be a bounded measurable function on $\mathbf{R}^n$.
Let $A$ be a compact set on which $f$ is continuous. Let $\{\varphi_k\}$ be
a sequence satisfying DIR 1, DIR 2, DIR 3. Then $\varphi_k * f$ converges
to $f$ uniformly on $A$.


Proof. For $x \in A$ we have:

$$\varphi_k * f(x) = \int \varphi_k(y) f(x - y)\,dy$$

and by DIR 2,

$$f(x) = f(x) \int \varphi_k(y)\,dy = \int f(x)\varphi_k(y)\,dy.$$

Hence

$$\varphi_k * f(x) - f(x) = \int \varphi_k(y)\bigl[f(x - y) - f(x)\bigr]\,dy.$$

By the relative uniform continuity of $f$ on $A$, given $\epsilon$, there
exists $\delta$ such that if $|y| < \delta$ then for all $x \in A$ we have

$$|f(x - y) - f(x)| < \epsilon.$$

We then write

$$\varphi_k * f(x) - f(x) = \int_{|y| < \delta} + \int_{|y| \ge \delta} \varphi_k(y)\bigl[f(x - y) - f(x)\bigr]\,dy.$$

The integral over $|y| < \delta$ is then bounded by $\epsilon$. For the
other integral with $|y| \ge \delta$, we use DIR 3 to conclude that this
integral is bounded by $2\|f\|_\infty\,\epsilon$ for $k$ sufficiently large.
This concludes the proof.


We shall now give classical examples of Theorem 3.1. 


Example 1 (The Landau Sequence and Weierstrass' Approximation
Theorem). By means of a suitable Dirac sequence one can give an
explicit proof for Weierstrass' theorem that a continuous function can be
uniformly approximated by a polynomial on a closed interval. Suppose
$f$ is continuous on $[a, b]$. Making a translation and dilation of the
variable if necessary we may assume that $[a, b] = [0, 1]$. Let $y = L(x)$
be the equation of the straight line passing through the end points of the
graph of the function. Then $L$ is a polynomial (of degree $\le 1$), and
$f - L$ has the additional property that $f - L$ vanishes at 0 and 1. Thus
without loss of generality, to prove Weierstrass' theorem, we may assume
that $f(0) = f(1) = 0$. We then extend $f$ to $\mathbf{R}$ by defining
$f(x) = 0$ for $x \notin [0, 1]$.


We now define the Landau functions

$$\varphi_k(x) = \frac{1}{c_k}(1 - x^2)^k \quad\text{for } |x| \le 1,$$

where the constant $c_k$ is taken to be

$$c_k = \int_{-1}^{1} (1 - x^2)^k\,dx,$$

so that

$$\int_{-1}^{1} \varphi_k(x)\,dx = 1.$$

We define $\varphi_k(x) = 0$ if $x$ is outside the interval $[-1, 1]$. It
is an easy exercise to show that $\{\varphi_k\}$ is a Dirac sequence. We
have

$$\varphi_k * f(x) = \int_{-\infty}^{\infty} \varphi_k(x - t) f(t)\,dt = \int_0^1 \varphi_k(x - t) f(t)\,dt.$$

Furthermore, for $x, t \in [0, 1]$ the function $\varphi_k(x - t)$ is a
polynomial, namely

$$\varphi_k(x - t) = \frac{1}{c_k}\bigl(1 - (x - t)^2\bigr)^k = \sum_j u_j(x)\,t^j$$

where each $u_j$ is a polynomial. Hence

$$\varphi_k * f(x) = \sum_j \alpha_j u_j(x) \quad\text{where}\quad \alpha_j = \int_0^1 t^j f(t)\,dt,$$

and therefore $\varphi_k * f$ is a polynomial. By Theorem 3.1 the sequence
$\{\varphi_k * f\}$ converges to $f$ uniformly on $[0, 1]$, thus proving
Weierstrass' theorem.
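The convergence in Example 1 can be observed numerically. The sketch below assumes the test function $f(t) = \sin \pi t$ on $[0, 1]$ (so that $f(0) = f(1) = 0$), approximates the integrals by a midpoint rule, and measures the error on a grid of 21 points rather than on all of $[0, 1]$; these choices and all function names are ours.

```python
import math

def midpoint(fn, a, b, steps=2000):
    """Midpoint-rule approximation of the integral of fn over [a, b]."""
    h = (b - a) / steps
    return h * sum(fn(a + (i + 0.5) * h) for i in range(steps))

def landau(k, x, c_k):
    """Landau function (1 - x^2)^k / c_k on [-1, 1], and 0 outside."""
    return (1.0 - x * x) ** k / c_k if abs(x) <= 1.0 else 0.0

def landau_approx(f, k, x):
    """(phi_k * f)(x) = integral over [0, 1] of phi_k(x - t) f(t) dt."""
    c_k = midpoint(lambda u: (1.0 - u * u) ** k, -1.0, 1.0)
    return midpoint(lambda t: landau(k, x - t, c_k) * f(t), 0.0, 1.0)

def sup_error(f, k, points=21):
    """Largest error |phi_k * f - f| over an equally spaced grid in [0, 1]."""
    xs = [i / (points - 1) for i in range(points)]
    return max(abs(landau_approx(f, k, x) - f(x)) for x in xs)

def f(t):
    """Hypothetical test function, vanishing at 0 and 1."""
    return math.sin(math.pi * t)
```

The error returned by `sup_error(f, k)` decreases as `k` grows, although slowly: the width of the Landau function is of order $1/\sqrt{k}$, so this proof of Weierstrass' theorem is explicit but not an efficient approximation scheme.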


In some applications, we deal with periodic functions of period $2\pi$,
and in this case a Dirac sequence is defined in an analogous way, taking
integrals over an interval of length $2\pi$. The next example is of this
type.

Example 2 (Cesàro Summation). Let $f$ be a periodic continuous function
of period $2\pi$. Let $S_n$ be the $n$-th partial sum of the Fourier series
for $f$, that is

$$S_n(x) = \sum_{k=-n}^{n} c_k e^{ikx} \quad\text{where}\quad c_k = \frac{1}{2\pi} \int_{-\pi}^{\pi} f(t) e^{-ikt}\,dt.$$

Let $A_n$ be the average of these partial sums, that is

$$A_n = \frac{1}{n}(S_0 + \cdots + S_{n-1}).$$

Then a theorem of Fejér-Cesàro asserts that $\{A_n\}$ converges uniformly
to $f$. This result is a special case of Theorem 3.1 as follows. Let

$$D_n(x) = \frac{1}{2\pi} \sum_{k=-n}^{n} e^{ikx} \quad\text{and}\quad K_n = \frac{1}{n}(D_0 + \cdots + D_{n-1}).$$

Then simple manipulations will prove the identity

$$K_n(x) = \frac{1}{2\pi n}\,\frac{\sin^2(nx/2)}{\sin^2(x/2)}.$$

It is then easy to verify that $\{K_n\}$ is a Dirac sequence, that
$D_n * f$ is the $n$-th partial sum of the Fourier series of $f$, and that
$K_n * f$ is the average of the partial sums. Therefore by Theorem 3.1, the
averages of the partial sums of the Fourier series converge uniformly to
$f$. Do Exercise 2 to carry out the details of this proof.
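Before doing the "simple manipulations", the identity for $K_n$ can be checked numerically. The sketch below assumes the normalization $D_n(x) = \frac{1}{2\pi}\sum_{k=-n}^{n} e^{ikx}$, written in real form as $\frac{1}{2\pi}(1 + 2\sum_{k=1}^{n} \cos kx)$; the function names are ours.

```python
import math

def dirichlet(n, x):
    """D_n(x) = (1/2pi) * (1 + 2 * sum_{k=1}^{n} cos(k x))."""
    s = 1.0 + 2.0 * sum(math.cos(k * x) for k in range(1, n + 1))
    return s / (2.0 * math.pi)

def fejer_average(n, x):
    """K_n(x) computed from its definition (D_0 + ... + D_{n-1}) / n."""
    return sum(dirichlet(j, x) for j in range(n)) / n

def fejer_closed(n, x):
    """Closed form sin^2(n x / 2) / (2 pi n sin^2(x / 2)), for x not in 2 pi Z."""
    return math.sin(n * x / 2.0) ** 2 / (2.0 * math.pi * n * math.sin(x / 2.0) ** 2)
```

The closed form makes DIR 1 evident, which is not at all obvious from the definition as an average of the functions $D_j$, which themselves change sign.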


In some cases, instead of considering a sequence of functions, one 
considers a family of functions indexed by some real numbers, as in the 
next example. 


Example 3 (Harmonic Functions and the Poisson Family). For $0 \le r < 1$,
define the Poisson family to be

$$P_r(\theta) = \frac{1}{2\pi} \sum_{n=-\infty}^{\infty} r^{|n|} e^{in\theta}.$$

Then $P_r(\theta)$ satisfies the three conditions DIR 1, DIR 2, DIR 3 where
$k$ is replaced by $r$ and $r \to 1$ instead of $k \to \infty$. In other
words:


DIR 1. We have $P_r(\theta) \ge 0$ for all $r$ and all $\theta$.
DIR 2. For all $r$ we have

$$\int_{-\pi}^{\pi} P_r(\theta)\,d\theta = 1.$$

DIR 3. Given $\epsilon$ and $\delta$, there exists $r_0$, $0 < r_0 < 1$, such
that if $r_0 < r < 1$, then

$$\int_{-\pi}^{-\delta} P_r + \int_{\delta}^{\pi} P_r < \epsilon.$$

For DIR 3 you will prove and use the formula

$$P_r(\theta) = \frac{1}{2\pi}\,\frac{1 - r^2}{1 - 2r\cos\theta + r^2}.$$
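The closed formula for the Poisson family and condition DIR 2 can both be tested numerically. In the sketch below the series is truncated after 200 terms and the integral is approximated by a midpoint rule; these numerical choices and the function names are assumptions of ours.

```python
import math

def poisson_series(r, theta, terms=200):
    """(1/2pi) * sum over |n| <= terms of r^|n| e^{i n theta}, in real form."""
    s = 1.0 + 2.0 * sum(r ** n * math.cos(n * theta) for n in range(1, terms + 1))
    return s / (2.0 * math.pi)

def poisson_closed(r, theta):
    """(1/2pi) * (1 - r^2) / (1 - 2 r cos(theta) + r^2)."""
    return (1.0 - r * r) / (2.0 * math.pi * (1.0 - 2.0 * r * math.cos(theta) + r * r))

def poisson_mass(r, steps=2000):
    """Midpoint-rule approximation of the integral of P_r over [-pi, pi]."""
    h = 2.0 * math.pi / steps
    return h * sum(poisson_closed(r, -math.pi + (i + 0.5) * h) for i in range(steps))
```

The closed form shows at once that $P_r \ge 0$, and for $|\theta| \ge \delta$ the denominator stays bounded away from 0 while the numerator $1 - r^2$ tends to 0, which is the content of DIR 3.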


Theorem 3.1 concerning Dirac sequences applies to the family $\{P_r\}$,
again letting $r \to 1$ instead of $k \to \infty$. In other words, let $f$
be a bounded measurable function on $\mathbf{R}$ which is periodic. Let $S$
be a compact set on which $f$ is continuous. Let

$$f_r = P_r * f.$$

Then $\{f_r\}$ converges to $f$ uniformly on $S$ as $r \to 1$.

The use of the Poisson family comes from the desire to solve a boundary
value problem as follows. We are given a function $f$, viewed as a
function on the circle, that is $f(\theta)$ is periodic, as usual. We want
to find a function on the disc, that is a function $u(r, \theta)$ with
$0 \le r \le 1$, satisfying the Laplace equation $\Delta u = 0$, where
$\Delta$ is the Laplace operator, given in polar coordinates by

$$\Delta u = \frac{\partial^2 u}{\partial r^2} + \frac{1}{r}\frac{\partial u}{\partial r} + \frac{1}{r^2}\frac{\partial^2 u}{\partial\theta^2},$$

and such that $u$ has period $2\pi$ in its second variable, that is

$$u(r, \theta) = u(r, \theta + 2\pi).$$

We want $u$ to be continuous, and we want $u(1, \theta)$ to be as much like
$f(\theta)$ as possible. If $f$ is continuous on the circle, then we want
$u(1, \theta) = f(\theta)$. The convolution

$$u(r, \theta) = (P_r * f)(\theta)$$

solves the problem, because $\Delta P = 0$, so by differentiating under the
integral sign we find

$$\Delta(P * f) = (\Delta P) * f = 0.$$

Carry out the details as Exercise 3.

Example 4 (The Heat Equation for the Laplace Operator). For $t > 0$,
and a real variable $x$, define


$$K_t(x) = K(x, t) = \frac{1}{(4\pi t)^{1/2}}\, e^{-x^2/4t}.$$


Then $\{K_t\}$ is a Dirac family, replacing $k$ by $t$ in the definition of Dirac
sequences, and letting $t \to 0$ instead of $k \to \infty$. We define the heat
operator to be


$$H = \frac{\partial^2}{\partial x^2} - \frac{\partial}{\partial t}$$


on functions of two variables $(x, t)$. You can easily verify that $HK = 0$.
Thus $K$ satisfies the heat equation. On $\mathbf{R}^n$, we can define $K_t$ and $H$
similarly, by


$$K_t(x) = K(x, t) = \frac{1}{(4\pi t)^{n/2}}\, e^{-x^2/4t}.$$


Here $x \in \mathbf{R}^n$ is an $n$-tuple, and $x^2$ is the dot product of $x$ with itself. The
heat operator is then written


$$H = \Delta - \frac{\partial}{\partial t},$$
where $\Delta$ is the Laplace operator, $\Delta = \sum (\partial/\partial x_j)^2$. Again $HK = 0$.

It is Exercise 4 to verify that $\{K_t\}$ is a Dirac family. By arguments
similar to those of Example 3, one verifies that for any bounded continuous
function $f$, the function $(x, t) \mapsto (K_t * f)(x)$ is a solution of the heat
equation on $\mathbf{R}^n \times \mathbf{R}^+$. The Dirac family $\{K_t\}$ is called the fundamental
heat family.
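
The claims of Example 4 can be observed numerically in one variable. The following Python sketch is only a sanity check, not a proof. The helper names (`heat_kernel`, `heat_residual`, `integrate`, `smoothed`), the finite-difference step, the truncation radius, and the test function $\cos$ are all our own choices for illustration. For $f = \cos$ one computes directly that $K_t * \cos = e^{-t}\cos$, so the error of the approximation at a point can be compared with $(1 - e^{-t})\,|\cos x|$.

```python
import math

def heat_kernel(x, t):
    """Fundamental heat kernel K_t(x) on the real line."""
    return math.exp(-x * x / (4.0 * t)) / math.sqrt(4.0 * math.pi * t)

def heat_residual(x, t, h=1e-3):
    """Central-difference approximation to (d^2/dx^2 - d/dt) K at (x, t)."""
    kxx = (heat_kernel(x + h, t) - 2.0 * heat_kernel(x, t)
           + heat_kernel(x - h, t)) / (h * h)
    kt = (heat_kernel(x, t + h) - heat_kernel(x, t - h)) / (2.0 * h)
    return kxx - kt

def integrate(f, a, b, n=20000):
    """Midpoint rule on [a, b] with n cells."""
    step = (b - a) / n
    return step * sum(f(a + (i + 0.5) * step) for i in range(n))

def smoothed(f, x, t):
    """(K_t * f)(x), with the integral truncated to |y| <= 12 sqrt(t)."""
    r = 12.0 * math.sqrt(t)
    return integrate(lambda y: heat_kernel(y, t) * f(x - y), -r, r)

# Total mass of K_t (should be close to 1), here for t = 0.1.
total_mass = integrate(lambda y: heat_kernel(y, 0.1), -10.0, 10.0)

# HK at one sample point (should be close to 0).
residual = heat_residual(0.7, 0.5)

# |K_t * cos - cos| at x = 0.3 for shrinking t (should decrease to 0).
errors = [abs(smoothed(math.cos, 0.3, t) - math.cos(0.3))
          for t in (0.1, 0.01, 0.001)]

print(total_mass, residual, errors)
```

The midpoint rule is used because the integrands are smooth and decay rapidly, so a crude rule already suffices here; any other quadrature would serve equally well.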


For an example of a solution of the heat equation for periodic func- 
tions, see Exercise 7. 


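Example 5 (The Heat Equation for the Schroedinger Operator). Recall
first the elementary definitions of hyperbolic trigonometric functions:


$$\cosh t = \frac{e^t + e^{-t}}{2}, \qquad \sinh t = \frac{e^t - e^{-t}}{2},$$
$$\coth t = \frac{\cosh t}{\sinh t}, \qquad \operatorname{csch} t = \frac{1}{\sinh t}.$$


Let $A_t$ be the $2 \times 2$ matrix given by
$$A_t = \begin{pmatrix} \coth t & -\operatorname{csch} t \\ -\operatorname{csch} t & \coth t \end{pmatrix}.$$
Define the Mehler family $\{F_t\}$ on $\mathbf{R}^2$ by the formula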
F(X) = (esch t)¥#e7"$4:**> for x eR’. 
Thus $F_t$ is positive, real valued, and we can write
$$F_t(X) = F(X, t) = F(x, y, t) = F_{t,x}(y)$$


if $X$ is the transpose of the vector $(x, y)$ in $\mathbf{R}^2$. Then:
(a) For each fixed $x$, the family $\{F_{t,x}\}$ is a Dirac family for $t \to 0$.
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1/a? , a 
L=2(2] — TX or 


Thus L is the heat operator for the Schroedinger operator 


(b) Let 


Then $LF = 0$, where the partial differentiation is with respect to the
variables $(x, t)$.

(c) For any reasonable function f of a variable y, we have the integral 
operation 


$$F * f(x, t) = \int_{\mathbf{R}} F(x, y, t) f(y)\,dy.$$


Suppose $f$ is bounded and continuous. Then $L(F * f) = 0$, i.e. $F * f$
satisfies the heat equation for the Schroedinger operator.


The proofs are left as Exercise 5. Readers who want to see a general 
context for this example are referred to Howe’s article [How], Section 5. 
See also [HowT], Chapter III, §2, and Exercise 5 of that chapter. For a 
somewhat different context, see [BGV], 4.2, p. 154. 

For an example when most of the formalism of a Dirac family works 
but the family is not positive, see Exercise 7. 


Next we consider the use of Dirac sequences for approximation in $\mathcal{L}^p$
with $p < \infty$. We shall use DIR 3s, i.e. functions with shrinking support.


Theorem 3.2. Let $f$ be in $\mathcal{L}^1(\mathbf{R}^n)$, and let $A$ be a compact set on which
$f$ is continuous. Let $\{\varphi_k\}$ be a Dirac sequence with shrinking support.
Then $\varphi_k * f$ converges to $f$ uniformly on $A$.


Proof. The proof is identical to the proof of Theorem 3.1 except for
the following final modification. At the very last estimate, for $k$ large, the
support of $\varphi_k$ is contained in the ball of radius $\delta$, whence the integral
expressing $\varphi_k * f(y) - f(y)$ is concentrated on that ball, and is obviously
estimated by $\epsilon$, thus proving our theorem.


Corollary 3.3. The support of $\varphi_k * f$ is contained in


$$\operatorname{supp} f + \operatorname{supp} \varphi_k.$$


If $f$ is continuous with compact support, then $\{\varphi_k * f\}$ converges
uniformly to $f$ on $\mathbf{R}^n$.

Proof. We have


$$\varphi_k * f(x) = \int \varphi_k(t) f(x - t)\,dt,$$


and this integral is concentrated on the support of $\varphi_k$. If


$$\varphi_k * f(x) \neq 0,$$


then we must have $x - t \in \operatorname{supp} f$ for some $t \in \operatorname{supp} \varphi_k$. Hence


$$x \in \operatorname{supp} f + \operatorname{supp} \varphi_k,$$
thus proving our first assertion. The second assertion follows at once
from the first, and from the theorem (both $f$ and $\varphi_k * f$ being equal to 0
outside some fixed compact set).


Corollary 3.4. Let $f \in \mathcal{L}^p(\mathbf{R}^n)$ for $1 \leq p < \infty$. Then $\{\varphi_k * f\}$ is $L^p$
convergent to $f$.


Proof. We know that $C_c(\mathbf{R}^n)$ is $L^p$ dense in $\mathcal{L}^p$. Let $g \in C_c(\mathbf{R}^n)$ be
such that


$$\|f - g\|_p < \epsilon.$$
Then we estimate


$$\|\varphi_k * f - f\|_p \leq \|\varphi_k * f - \varphi_k * g\|_p + \|\varphi_k * g - g\|_p + \|g - f\|_p.$$


Since


$$\int \varphi_k\,dx = 1,$$


we have $\|\varphi_k\|_1 = 1$. Using Theorem 1.2, we find


$$\|\varphi_k * f - \varphi_k * g\|_p = \|\varphi_k * (f - g)\|_p \leq \|f - g\|_p < \epsilon.$$


By Theorem 3.2 and Corollary 3.3, $\{\varphi_k * g\}$ converges uniformly to $g$, and
has a support which lies close to that of $g$. This implies that


$$\|\varphi_k * g - g\|_p < \epsilon$$


for $k$ large. The last of the three terms in our estimate above is $< \epsilon$,
thus concluding the proof.


VIII, §4. THE SCHWARTZ SPACE AND
FOURIER TRANSFORM


Let $f$ be a function on $\mathbf{R}^n$. We shall say that $f$ tends to 0 rapidly at
infinity if for each positive integer $m$ the function


$$x \mapsto (1 + |x|)^m f(x), \qquad x \in \mathbf{R}^n,$$


is bounded for $|x|$ sufficiently large. Here as in the rest of this chapter,
$|x|$ is the euclidean norm of $x$. Equivalently, the preceding condition can
be formulated by saying that for every polynomial $P$ (in $n$ variables) the
function $Pf$ is bounded, or that the function


$$x \mapsto |x|^m f(x)$$


is bounded, for $x$ sufficiently large (i.e. $|x|$ sufficiently large).

We define the Schwartz space to be the set of functions on $\mathbf{R}^n$ which
are infinitely differentiable (i.e. partial derivatives of all orders exist and
are continuous), and which tend to 0 rapidly at infinity, as well as their
partial derivatives of all orders.


Example of such functions. In one variable, $e^{-x^2}$ is one, and similarly
in $n$ variables if we interpret $x^2$ as the dot product $x \cdot x$. As a matter
of notation, we shall write $xy$ instead of $x \cdot y$ if $x, y$
are elements of $\mathbf{R}^n$.

If f is in the Schwartz space and P is a polynomial, then the product 
Pf is in the Schwartz space. 

If $f$ is a $C^\infty$ function of one variable which is 0 outside some bounded
interval, then $f$ is in the Schwartz space. As an example, one can take
the function


$$f(x) = \begin{cases} e^{-1/((x-a)(b-x))} & \text{if } a < x < b, \\ 0 & \text{otherwise.} \end{cases}$$
An analogous function in $n$ variables can be obtained by taking the
product


$$f(x_1) \cdots f(x_n).$$


It is clear that the Schwartz space is a vector space, which we denote
by $S(\mathbf{R}^n)$ or simply by $S$. We take all our functions to be complex
valued, so $S$ is a space over $\mathbf{C}$.

We let $D_j$ be the partial derivative with respect to the $j$-th variable.
For each $n$-tuple of integers $\geq 0$, $p = (p_1, \ldots, p_n)$, we write


$$D^p = D_1^{p_1} \cdots D_n^{p_n},$$
so that $D^p$ is a partial differential operator, which maps $S$ into itself. As
a matter of notation, we write


$$|p| = p_1 + \cdots + p_n.$$


It is also convenient to use the notation $M_j f$ for the function such that


$$(M_j f)(x) = x_j f(x).$$
Thus, $M_j$ is multiplication by the $j$-th variable. Also


$$M^p f = M_1^{p_1} \cdots M_n^{p_n} f,$$
so that
$$(M^p f)(x) = x_1^{p_1} \cdots x_n^{p_n} f(x).$$


In what follows, we shall take the integral of certain functions over $\mathbf{R}^n$,
and we use the following notation:


$$\int_{\mathbf{R}^n} f(x)\,dx = \int f(x)\,dx = \int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} f(x_1, \ldots, x_n)\,dx_1 \cdots dx_n.$$


Since our functions will be taken from S, there is no convergence prob- 
lem, because for x sufficiently large, we have for some constant C: 


C 


and we can view the integral as a repeated integral, the order of
integration being arbitrary. The justification is at the level of elementary
calculus. Furthermore, we differentiate under the integral sign, using the
formula

$$\frac{\partial}{\partial y_j} \int K(x, y)\,dx = \int \frac{\partial}{\partial y_j} K(x, y)\,dx$$

for suitable functions $K$ in situations where this is obviously permissible
(justification loc. cit.), namely when the partial derivatives of $K$ exist, are
continuous, and are bounded by an absolutely integrable function of $x$,
as in Lemma 2.2.

We shall also change variables in an integral, but nothing worse than
the following cases:


$$\int f(x - a)\,dx = \int f(x)\,dx, \qquad \int f(-x)\,dx = \int f(x)\,dx.$$


If $c > 0$, then
$$\int f(cx)\,dx = \frac{1}{c^n} \int f(x)\,dx.$$
The general change of variables formula, of which these are but elementary
cases, will be proved in detail in Chapter XXI, §2.
Finally, for normalization purposes, we shall write formally


$$d_1 x = \frac{1}{(2\pi)^{n/2}}\,dx.$$


This makes some formulas come out more symmetrically at the end.
We now define the Fourier transform of a function $f \in S$ by


$$\hat{f}(y) = \int f(x) e^{-ixy}\,d_1 x.$$


Remember that $xy = x \cdot y$.
Since


$$\frac{\partial}{\partial y_j} f(x) e^{-ixy} = f(x)(-i)x_j e^{-ixy},$$
we see that we can differentiate under the integral sign, and that


$$D_j \hat{f} = (-i)(M_j f)^{\wedge}.$$


By induction, we get


$$D^p \hat{f} = (-i)^{|p|}(M^p f)^{\wedge}.$$


The analogous formula reversing the roles of $D^p$ and $M^p$ is also true,
namely:
$$M^p \hat{f} = (-i)^{|p|}(D^p f)^{\wedge}.$$


To see this, we consider
$$y_j \hat{f}(y) = \int f(x)\,y_j e^{-ixy}\,d_1 x$$
and integrate by parts with respect to the $j$-th variable first. We let


$$u = f(x) \quad \text{and} \quad dv = y_j e^{-ixy}\,dx_j.$$


Then $v = ie^{-ixy}$ and the term $uv$ between $-\infty$ and $+\infty$ gives zero
contribution because $f$ tends to 0 at infinity. Hence
$$M_j \hat{f}(y) = (-i) \int D_j f(x) e^{-ixy}\,d_1 x = (-i)(D_j f)^{\wedge}(y).$$


Induction now yields our formula.


Theorem 4.1. The Fourier transform $f \mapsto \hat{f}$ is a linear map of the
Schwartz space into itself.


Proof. If $f \in S$, then it is clear that $\hat{f}$ is bounded, in fact by


$$|\hat{f}| \leq \int |f(x)|\,d_1 x.$$


The expression for $M^p \hat{f}$ in terms of the Fourier transform of $D^p f$, which
is in $S$, shows that $M^p \hat{f}$ is bounded, so that $\hat{f}$ tends rapidly to zero
at infinity. Similarly, one sees that $M^p D^q \hat{f}$ is bounded, because we let
$g = M^q f$, $g \in S$, and


$$M^p D^q \hat{f} = (-i)^{|q|} M^p \hat{g}$$
is bounded. This proves our theorem.


For $f, g \in S$ we define the convolution
$$f * g(x) = \int f(t) g(x - t)\,d_1 t.$$


This integral is obviously absolutely convergent, and the reader will
verify at once that the map


$$(f, g) \mapsto f * g$$
is bilinear. Furthermore changing variables shows that
$$f * g = g * f.$$

Theorem 4.2. If $f, g \in S$, then $f * g$ is also in $S$, and


$$D^p(f * g) = D^p f * g = f * D^p g.$$
Furthermore,


$$(f * g)^{\wedge} = \hat{f}\hat{g}.$$


Proof. We can differentiate under the integral sign with respect to $x$,
and thus obtain the formula for the partial derivatives $D^p(f * g)$, which
we see exist, are bounded, and are continuous. Now we write
$$|x|^m \leq (|x - t| + |t|)^m = \sum c_{rs}\,|x - t|^r\,|t|^s,$$
where $c_{rs}$ is a fixed integer depending only on $m$. Then


$$|x|^m\,|f * g(x)| \leq \sum c_{rs} \int |t|^s |f(t)|\,|x - t|^r |g(x - t)|\,d_1 t$$


is bounded, and we conclude that $f * g$ tends rapidly to zero at infinity.
We can apply the same argument to $D^p f * g$ to conclude that $f * g$ lies in
$S$. Finally, we have


$$(f * g)^{\wedge}(y) = \int (f * g)(x) e^{-ixy}\,d_1 x = \iint f(t) g(x - t) e^{-ixy}\,d_1 t\,d_1 x,$$


and we can interchange the order of integration to get


$$= \iint f(t) g(x - t) e^{-ixy}\,d_1 x\,d_1 t.$$


We change variables, letting $u = x - t$, $d_1 u = d_1 x$, and see that our last
integral is equal to


$$\iint f(t) g(u) e^{-i(u+t)y}\,d_1 u\,d_1 t = \hat{f}(y)\hat{g}(y),$$


thus proving our theorem.


Example 1. We recall the value
$$\int e^{-x^2/2}\,dx = (2\pi)^{n/2},$$


which is obtained first in one variable using polar coordinates. Let
$$h(x) = e^{-x^2/2}.$$


Then we contend that $\hat{h} = h$. To see this, we differentiate under the
integral sign to find, say in one variable, that


$$D\hat{h}(y) = -y\hat{h}(y).$$


Thus differentiating the quotient $\hat{h}(y)/e^{-y^2/2}$ yields 0, whence


$$\hat{h}(y) = Ce^{-y^2/2}$$
for some number $C$. This number is equal to 1, using the evaluation of
the definite integral recalled above, and our present normalization of the
Fourier transform, with $d_1 x$ instead of $dx$.


Example 2. Let $a$ be real, $a > 0$, and for any function $h \in S$ let


$$g(x) = h(ax).$$
Then
$$\hat{g}(y) = \frac{1}{a^n}\,\hat{h}(y/a).$$


This is proved trivially, changing the variable in the integral defining the 
Fourier transform. 
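
Examples 1 and 2 lend themselves to a quick numerical check in one variable. The Python sketch below approximates the Fourier transform, with the normalization $d_1 x = dx/\sqrt{2\pi}$, by a midpoint rule on a truncated interval. The function name `fourier_transform`, the truncation radius, the number of cells, the sample points, and the choice $a = 2$ are assumptions made for illustration only.

```python
import math

def fourier_transform(f, y, radius=12.0, n=24000):
    """Approximate the integral of f(x) e^{-ixy} d_1x over [-radius, radius],
    where d_1x = dx / sqrt(2 pi), by the midpoint rule with n cells."""
    step = 2.0 * radius / n
    re = 0.0
    im = 0.0
    for i in range(n):
        x = -radius + (i + 0.5) * step
        fx = f(x)
        re += fx * math.cos(x * y)
        im -= fx * math.sin(x * y)
    scale = step / math.sqrt(2.0 * math.pi)
    return complex(re * scale, im * scale)

def h(x):
    """The Gaussian of Example 1 in one variable."""
    return math.exp(-x * x / 2.0)

# Example 1: the transform of h should agree with h itself.
gaussian_errors = [abs(fourier_transform(h, y) - h(y))
                   for y in (0.0, 0.5, 1.5, 3.0)]

# Example 2 with n = 1: g(x) = h(ax) should have transform (1/a) h^(y/a),
# and h^ = h by Example 1.
a = 2.0
scaling_errors = [abs(fourier_transform(lambda x: h(a * x), y) - h(y / a) / a)
                  for y in (0.0, 1.0, 2.5)]

print(max(gaussian_errors), max(scaling_errors))
```

Both lists should contain only numbers at the level of rounding error, since the integrands are smooth and negligible outside the truncation interval.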


VIII, §5. THE FOURIER INVERSION FORMULA

If $f$ is a function, we denote by $f^-$ the function such that $f^-(x) = f(-x)$.


The reader will immediately verify that the minus operation commutes
with all the other operations we have introduced so far. For instance:


$$(\hat{f})^- = (f^-)^{\wedge}, \qquad (f * g)^- = f^- * g^-, \qquad (fg)^- = f^- g^-.$$
Note that $(f^-)^- = f$.


Theorem 5.1 (Fourier Inversion). For every function $f \in S$ we have


$$\hat{\hat{f}} = f^-.$$


Proof. Let $g$ be some function in $S$. After interchanging integrals, we
find


$$\int \hat{f}(x) e^{-ixy} g(x)\,d_1 x = \iint f(t) e^{-itx} e^{-ixy} g(x)\,d_1 t\,d_1 x$$


$$= \int f(t)\hat{g}(t + y)\,d_1 t.$$
Let $h \in S$ and let $g(u) = h(au)$ for $a > 0$. Then


$$\hat{g}(u) = \frac{1}{a^n}\,\hat{h}(u/a),$$


and hence


$$\int \hat{f}(x) e^{-ixy} h(ax)\,d_1 x = \int f(t)\,\frac{1}{a^n}\,\hat{h}\!\left(\frac{t + y}{a}\right) d_1 t = \int f(au - y)\hat{h}(u)\,d_1 u$$


after a change of variables,


$$u = \frac{t + y}{a}, \qquad d_1 u = \frac{1}{a^n}\,d_1 t.$$


Both integrals depend on a parameter $a$, and are continuous in $a$. We let
$a \to 0$ and find


$$h(0)\hat{\hat{f}}(y) = f(-y) \int \hat{h}(u)\,d_1 u = f(-y)\hat{\hat{h}}(0).$$
Let $h$ be the function of Example 1. Then Theorem 5.1 follows.


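Theorem 5.2. For every $f \in S$ there exists a function $\varphi \in S$ such that
$f = \hat{\varphi}$. If $f, g \in S$, then
$$(fg)^{\wedge} = \hat{f} * \hat{g}.$$
Proof. First, it is clear that applying the roof operation four times to
a function $f$ gives back $f$ itself. Thus $f = \hat{\varphi}$, where $\varphi = f^{\wedge\wedge\wedge}$. Now to
prove the formula, write $f = \hat{\varphi}$ and $g = \hat{\psi}$. Then $\hat{f} = \varphi^-$ and $\hat{g} = \psi^-$ by
Theorem 5.1. Furthermore, using Theorem 4.2, we find


$$(fg)^{\wedge} = (\hat{\varphi}\hat{\psi})^{\wedge} = ((\varphi * \psi)^{\wedge})^{\wedge} = (\varphi * \psi)^- = \varphi^- * \psi^- = \hat{f} * \hat{g},$$
as was to be shown.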


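We introduce the violently convergent hermitian product


$$\langle f, g \rangle = \int f(x)\overline{g(x)}\,dx.$$


We observe that the first step of the proof in Theorem 5.1 yields


$$\int \hat{f}(x) g(x)\,dx = \int f(x)\hat{g}(x)\,dx$$
by letting $y = 0$ on both sides. Furthermore, we have directly from the
definitions


$$\overline{\hat{f}} = (\hat{\bar{f}}\,)^-,$$


where the bar means complex conjugate.


Theorem 5.3 (Parseval Formula). For $f, g \in S$ we have


$$\langle f, g \rangle = \langle \hat{f}, \hat{g} \rangle$$


and hence


$$\|f\|_2 = \|\hat{f}\|_2.$$


Proof. We have


$$\langle \hat{f}, \hat{g} \rangle = \int \hat{f}\,\overline{\hat{g}}\,dx = \int \hat{f}\,(\hat{\bar{g}}\,)^-\,dx = \int f\,\bigl((\hat{\bar{g}}\,)^-\bigr)^{\wedge}\,dx = \int f\,\bar{g}\,dx = \langle f, g \rangle,$$


where the last step uses Theorem 5.1 in the form $\bigl((\hat{\bar{g}}\,)^-\bigr)^{\wedge} = (\hat{\hat{\bar{g}}}\,)^- = \bar{g}$.
This proves what we wanted.


Theorem 5.3 shows that the map $f \mapsto \hat{f}$ is an automorphism of $S$,
preserving the hermitian product and thus the $L^2$-norm, and thus extends
to an isometry on $L^2$, since the Schwartz space is dense in $L^2$.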


VIII, §6. THE POISSON SUMMATION FORMULA


A function $g$ on $\mathbf{R}^n$ will be called periodic if $g(x + k) = g(x)$ for all $k \in \mathbf{Z}^n$.
We let $\mathbf{T}^n = \mathbf{R}^n/\mathbf{Z}^n$ be the $n$-torus. Let $g$ be a periodic $C^\infty$ function. We
define its $k$-th Fourier coefficient for $k \in \mathbf{Z}^n$ by


$$c_k = \int_{\mathbf{T}^n} g(x) e^{-2\pi i kx}\,dx.$$


The integral on $\mathbf{T}^n$ is by definition the $n$-fold integral with the variables
$(x_1, \ldots, x_n)$ ranging from 0 to 1. Integrating by parts $d$ times for any
integer $d > 0$, and using the fact that the partial derivatives of $g$ are
bounded, we conclude at once that there is some number $C = C(d, g)$
such that for all $k \in \mathbf{Z}^n$, $k \neq 0$, we have $|c_k| \leq C/\|k\|^d$, where $\|k\|$ is the sup norm.
Hence the Fourier series


$$g(x) = \sum_{k \in \mathbf{Z}^n} c_k e^{2\pi i kx}$$


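converges to $g$ uniformly.

If $f$ is in the Schwartz space, we normalize its Fourier transform in
this section by


$$\hat{f}(y) = \int_{\mathbf{R}^n} f(x) e^{-2\pi i xy}\,dx.$$

Poisson Summation Formula. Let $f$ be in the Schwartz space. Then


$$\sum_{m \in \mathbf{Z}^n} f(m) = \sum_{m \in \mathbf{Z}^n} \hat{f}(m).$$


Proof. Let
$$g(x) = \sum_{k \in \mathbf{Z}^n} f(x + k).$$


Then $g$ is periodic and $C^\infty$. If $c_m$ is its $m$-th Fourier coefficient, then


$$\sum_{m \in \mathbf{Z}^n} c_m = g(0) = \sum_{k \in \mathbf{Z}^n} f(k).$$


On the other hand, interchanging a sum and integral, we get
$$c_m = \int_{\mathbf{T}^n} g(x) e^{-2\pi i mx}\,dx = \sum_{k \in \mathbf{Z}^n} \int_{\mathbf{T}^n} f(x + k) e^{-2\pi i mx}\,dx$$


$$= \sum_{k \in \mathbf{Z}^n} \int_{\mathbf{T}^n} f(x + k) e^{-2\pi i m(x + k)}\,dx$$


$$= \int_{\mathbf{R}^n} f(x) e^{-2\pi i mx}\,dx = \hat{f}(m).$$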


This proves the Poisson summation formula. 
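
As an illustration in one variable, take $f(x) = e^{-\pi a x^2}$ with $a > 0$. With the normalization of this section its Fourier transform is $\hat{f}(y) = a^{-1/2} e^{-\pi y^2/a}$, as one sees from Examples 1 and 2 of §4 after rescaling. The Python sketch below compares truncated versions of the two sides of the Poisson summation formula. The function names, the cutoff, and the sample values of $a$ are our own choices, made only for the sake of the example.

```python
import math

def gaussian(a):
    """The function f(x) = exp(-pi a x^2)."""
    return lambda x: math.exp(-math.pi * a * x * x)

def gaussian_hat(a):
    """Its Fourier transform, f^(y) = a^(-1/2) exp(-pi y^2 / a),
    for the normalization with e^{-2 pi i x y} dx."""
    return lambda y: math.exp(-math.pi * y * y / a) / math.sqrt(a)

def lattice_sum(f, cutoff=60):
    """Sum of f(m) over the integers m with |m| <= cutoff."""
    return sum(f(m) for m in range(-cutoff, cutoff + 1))

# Each entry is (a, sum of f(m), sum of f^(m)); the two sums should agree.
checks = [(a, lattice_sum(gaussian(a)), lattice_sum(gaussian_hat(a)))
          for a in (0.25, 1.0, 3.0)]

for a, lhs, rhs in checks:
    print(a, lhs, rhs)
```

The terms decay so fast that the truncation at $|m| \leq 60$ has no visible effect in floating point arithmetic.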


VIII, §7. AN EXAMPLE OF FOURIER TRANSFORM
NOT IN THE SCHWARTZ SPACE


The Fourier transform of a function is often a complicated object, but to
deal with applications, all that is frequently needed are estimates on its
growth behavior. Functions in the Schwartz space provide the simplest
class of functions for which the Fourier transform behaves in a particularly
simple manner. We give here an example which is more complicated.
Let $\varphi$ be the characteristic function of the unit disc in the plane,
that is
$$\varphi(x) = \begin{cases} 1 & \text{if } |x| \leq 1, \\ 0 & \text{if } |x| > 1. \end{cases}$$
Then $\varphi$ has compact support, and is certainly in $\mathcal{L}^1$. Its Fourier
transform is therefore given by the integral


$$\hat{\varphi}(y) = \int_{|x| \leq 1} e^{-2\pi i x \cdot y}\,dx = \int_{|x| \leq 1} e^{2\pi i x \cdot y}\,dx.$$


This Fourier transform depends only on the distance $s = |y|$, and if we
use polar coordinates, then we can rewrite the integral in the form


$$\hat{\varphi}(y) = \int_0^1 \left( \int_0^{2\pi} e^{2\pi i rs \cos\theta}\,d\theta \right) r\,dr.$$


But the inner integral is a classical Bessel function, namely by definition,
for any integer $n$ one lets


$$J_n(z) = \frac{1}{2\pi} \int_{-\pi}^{\pi} e^{-in\theta + iz\sin\theta}\,d\theta.$$


Thus


$$\hat{\varphi}(y) = 2\pi \int_0^1 J_0(2\pi rs)\,r\,dr.$$


As an example of concrete analysis over the reals, we shall estimate the
Bessel function for $z$ real tending to infinity.


Proposition 7.1. We have
$$J_n(t) \ll t^{-1/2} \qquad \text{for } t \to \infty.$$


(The sign $\ll$ means that the left-hand side is bounded in absolute value
by a constant times the right-hand side, namely $O(t^{-1/2})$.)


Proof. For concreteness, we deal with the case $n = 0$, and we shall
just consider a typical integral contributing to $J_0(t)$, namely


$$\int_0^{\pi/2} e^{it\cos\theta}\,d\theta = \int_0^1 e^{itu}\,\frac{1}{\sqrt{1 - u^2}}\,du.$$
Again typically, we show that


$$\int_0^1 e^{itu}\,\frac{1}{\sqrt{1 - u^2}}\,du = O(t^{-1/2}).$$
We may rewrite the integral in the form
$$\int_0^1 e^{itu}\,\frac{1}{\sqrt{1 - u}}\,g(u)\,du,$$


where $g(u) = 1/\sqrt{1 + u}$ is $C^\infty$ over the interval. Integrating by parts (cf.
also Lemma 7.3), we see that the desired integral satisfies the bound:


$$\int_0^1 e^{itu}\,\frac{1}{\sqrt{1 - u}}\,g(u)\,du \ll (\|g\| + \|g'\|) \max_{0 \leq x \leq 1} \left| \int_0^x e^{itu}\,\frac{1}{\sqrt{1 - u}}\,du \right|.$$


Thus we are reduced to the following lemma.


Lemma 7.2. Let $0 \leq a < b \leq 1$. Then uniformly in $a, b$ we have


$$\int_a^b e^{itu}\,\frac{1}{\sqrt{1 - u}}\,du = O(t^{-1/2}).$$


Proof. Let $v = 1 - u$, and then $tv = r$. Then the integral is estimated
by the absolute value of
$$\frac{1}{\sqrt{t}} \int_A^B e^{ir}\,\frac{1}{\sqrt{r}}\,dr,$$


where $0 \leq A \leq B$. But writing $e^{ir} = \cos r + i\sin r$, and noting that $1/\sqrt{r}$
is monotone decreasing, we see that the integral on the right-hand side is
uniformly bounded independently of $A, B$. This proves the lemma, and
also concludes the proof of the proposition for $n = 0$.


The integration by parts shows that the asymptotic behavior of the 
Fourier transform depends only on the singularity. The case just treated 
is typical, and we let the reader handle the proof in general by using the 
next lemma, which shows how the singularity affects the estimate. 


Lemma 7.3. Let $[a, b)$ be a half-open interval. Let $f$ be a continuous
function on this interval, which is also in $\mathcal{L}^1$. Let $g$ be $C^1$ on the closed
interval $[a, b]$. Then the Fourier transform satisfies the estimate


$$\int_a^b g(u) e^{itu} f(u)\,du \ll (\|g\| + \|g'\|)\,\|F_t\|,$$


where


$$F_t(x) = \int_a^x e^{itu} f(u)\,du,$$


and $\|F_t\|$ is the sup norm for $x \in [a, b]$.

Proof. This is an immediate consequence of integration by parts.


Theorem 7.4. Let $\varphi$ be the characteristic function of the unit disc in the
plane. Then


$$\hat{\varphi}(y) = O(|y|^{-3/2}).$$


Proof. As before, let $s = |y|$. By definition, we have
$$\frac{\pi}{2} \int_0^1 J_0(rs)\,r\,dr = \int_0^1 \int_0^1 \cos(urs)(1 - u^2)^{-1/2}\,du\,r\,dr$$
[setting $ur = t$, $r\,du = dt$]
$$= \int_0^1 \int_0^r \cos(ts)\bigl(1 - (t/r)^2\bigr)^{-1/2}\,dt\,dr$$
$$= \int_0^1 \cos(ts) \int_t^1 \bigl(1 - (t/r)^2\bigr)^{-1/2}\,dr\,dt$$
[by direct integration]
$$= \int_0^1 \cos(ts)(1 - t^2)^{1/2}\,dt$$
[integration by parts]
$$= \frac{1}{s} \int_0^1 \sin(ts)\,\frac{t}{\sqrt{1 - t^2}}\,dt.$$


Estimating this last integral as in Lemmas 7.2 and 7.3 concludes the 
proof. 
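
The decay rates of Proposition 7.1 and Theorem 7.4 can be observed numerically. The Python sketch below evaluates $J_n$ from its defining integral and $\hat{\varphi}$ from the formula $\hat{\varphi}(y) = 2\pi \int_0^1 J_0(2\pi rs)\,r\,dr$, both by the midpoint rule, and then tabulates $t^{1/2}|J_0(t)|$ and $s^{3/2}|\hat{\varphi}|$. It illustrates the estimates and proves nothing. The function names, the numbers of cells, and the sample points are assumptions chosen for illustration.

```python
import math

def bessel_j(n, z, cells=2000):
    """J_n(z) = (1/2pi) * integral over [-pi, pi] of exp(-i n theta + i z sin theta).
    The imaginary part is odd in theta and integrates to 0, so only the
    cosine of the phase is summed (midpoint rule)."""
    step = 2.0 * math.pi / cells
    total = 0.0
    for k in range(cells):
        theta = -math.pi + (k + 0.5) * step
        total += math.cos(z * math.sin(theta) - n * theta)
    return total * step / (2.0 * math.pi)

def disc_transform(s, radial_cells=400, angular_cells=200):
    """Transform of the unit disc at |y| = s, computed as
    2 pi * integral from 0 to 1 of J_0(2 pi r s) r dr (midpoint rule)."""
    step = 1.0 / radial_cells
    total = 0.0
    for k in range(radial_cells):
        r = (k + 0.5) * step
        total += bessel_j(0, 2.0 * math.pi * r * s, angular_cells) * r
    return 2.0 * math.pi * total * step

# t^(1/2) |J_0(t)| for t = 1, ..., 200 (should stay bounded).
bessel_scaled = [math.sqrt(t) * abs(bessel_j(0, float(t))) for t in range(1, 201)]

# s^(3/2) |phi^(y)| for a few values of s = |y| (should stay bounded).
disc_scaled = [s ** 1.5 * abs(disc_transform(s)) for s in (1.0, 2.0, 4.0, 8.0)]

print(max(bessel_scaled), disc_scaled)
```

At $s = 0$ the same routine returns the area $\pi$ of the unit disc, which gives an independent check on the normalization.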


VIII, §8. EXERCISES


1. Show that the Landau functions form a Dirac sequence.


2. In the case of Fourier series as in Example 2, show that $D_n * f$ is the partial
sum of the Fourier series of $f$. Prove that $\{K_n\}$ as defined in Example 2 is a
Dirac sequence. (To see this worked out, cf. my Undergraduate Analysis,
Chapter 12, §3.)


3. Prove all the facts stated in Example 3, namely:
(a) The Poisson family is a Dirac family.
(b) The Poisson family satisfies the Laplace equation, so is harmonic on the
disc.
(c) For a continuous periodic function $f$ of $\theta$, the function


$$F(r, \theta) = P_r * f(\theta)$$


satisfies Laplace's equation, that is $\Delta F = 0$. You will need to differentiate
under the integral sign, and whatnot.


4. Prove the analogous statements for the heat equation of Example 4, replacing
the words "Poisson" and "Laplace" by "heat" in the preceding exercise, and
using a function $f$ of a variable $x$ on $\mathbf{R}$ (or $\mathbf{R}^n$) instead of a periodic function
of $\theta$.


5. Prove the statements about the heat family for the Schroedinger operator in 
Example 5. 
The formula for F(x, y, t) can be arrived at naturally as follows. Suppose 
F has the form 


$$F(x, y, t) = \exp\bigl(a(t)(x^2 + y^2) + 2b(t)xy + c(t)\bigr).$$


Applying the heat operator, one sees that for F to satisfy the heat equation it 
suffices that the unknown coefficients a, b, c satisfy ordinary differential equa- 
tions in the variable t, with a solution which is given by the elementary 
hyperbolic exponential functions as stated. 


6. For $y > 0$ define


$$\varphi_y(x) = \frac{1}{\pi}\,\frac{y}{x^2 + y^2}.$$


(a) Prove that $\{\varphi_y\}$ is a Dirac family, for $y \to 0$ (instead of $k \to \infty$).

(b) Show that as a function of two variables $\varphi(x, y) = \varphi_y(x)$ satisfies Laplace's
equation, i.e. $\varphi$ is harmonic on the upper half plane. With this construction,
we get harmonic functions on the upper half plane having given
bounded boundary value on the real line.

The Dirac family of this example will be used in Chapter XX, §2 in 
connection with functional analysis. 


7. Although Dirac families cover a lot of territory, they are not universally 
applicable. At the most basic level, for Fourier series, one still wants to know 
conditions under which the ordinary Fourier series $D_n * f$ converges to the
function f. We give here an example of an object which satisfies the condi- 
tions of a Dirac family except for the positivity, on the circle, so for periodic 
functions. 

For t>0 and x eR, let 


0,(x) = O(x, )= YL ert te2nine, 


n>=—o 


(a) Show that $\theta$ satisfies the conditions of a Dirac family, except for the
positivity condition. Note that $\theta_t$ is periodic in the variable $x$.

(b) For $f$ continuous periodic, show that $\theta_t * f$ converges to $f$ uniformly as
$t \to 0$.

(c) Show that $\theta$ satisfies the heat equation, and so does $\theta_t * f(x)$, as a function
of $(x, t)$. The heat equation is normalized here in the form


100 30 


8. Let $f \in \mathcal{L}^1(\mathbf{R}) \cap \mathcal{L}^2(\mathbf{R})$ and suppose the function $x\hat{f}(x)$ is in $\mathcal{L}^1(\mathbf{R})$. Prove
that there is a $C^1$ function $g$ such that $g = f$ almost everywhere, and give a
formula for $g$.


9. Let $A > 0$ and let $f \in \mathcal{L}^1(\mathbf{R})$. Define


$$f_A(x) = \frac{1}{\sqrt{2\pi}} \int_{-A}^{A} \hat{f}(t) e^{itx}\,dt.$$


Prove that


$$f_A(x) = \frac{1}{\pi} \int_{-\infty}^{\infty} f(y)\,\frac{\sin A(x - y)}{x - y}\,dy.$$


10. Prove that there is a function $h$ in the Schwartz space such that $\hat{h}$ has
compact support and $h(0) \neq 0$. Show that for such a function, $h_A = h$ for all
$A$ sufficiently large (notation as in Exercise 9).


11. Let $f \in \mathcal{L}^1(\mathbf{R})$. Prove that $\hat{f}$ is uniformly continuous on $\mathbf{R}$.


12. The lattice point problem. Let $N(R)$ be the number of lattice points (that is,
elements of $\mathbf{Z}^2$) in the closed disc of radius $R$ in the plane. A famous
conjecture asserts that


$$N(R) = \pi R^2 + O(R^{1/2 + \epsilon})$$


for every $\epsilon > 0$. It is known that the error term cannot be $O(R^{1/2}(\log R)^k)$ for
any positive integer $k$ (result of Hardy and Landau). Prove the following
best-known result of Sierpinski-Van der Corput-Vinogradov-Hua:


$$N(R) = \pi R^2 + O(R^{2/3}).$$
Sketch of Proof. Let $\varphi$ be the characteristic function of the unit disc, and put
$$\varphi_R(x) = \varphi(x/R).$$


Let $\psi$ be a $C^\infty$ function with compact support, positive, and such that


$$\int_{\mathbf{R}^2} \psi(x)\,dx = 1.$$
Let

$$\psi_\epsilon(x) = \epsilon^{-2}\psi(x/\epsilon).$$


Then $\{\psi_\epsilon\}$ is a Dirac family for $\epsilon \to 0$, and we can apply the Poisson
summation formula to the convolution $\varphi_R * \psi_\epsilon$, to get


$$\sum_{m \in \mathbf{Z}^2} \varphi_R * \psi_\epsilon(m) = \sum_{m \in \mathbf{Z}^2} \hat{\varphi}_R(m)\hat{\psi}_\epsilon(m)$$


$$= \pi R^2 + \sum_{m \neq 0} R^2 \hat{\varphi}(Rm)\hat{\psi}(\epsilon m).$$


We shall choose $\epsilon$ depending on $R$ to make the error term best possible.
Note that $\varphi_R * \psi_\epsilon(x) = \varphi_R(x)$ if $\operatorname{dist}(x, S_R) > \epsilon$, where $S_R$ is the circle of
radius $R$. Therefore we get an estimate


$$|\text{left-hand side} - N(R)| \ll \epsilon R.$$
Splitting off the term with $m = 0$ on the right-hand side, we find (using
Theorem 7.4):


$$\sum_{m \neq 0} R^2 \hat{\varphi}(Rm)\hat{\psi}(\epsilon m) \ll R^2 R^{-3/2} \sum_{m \neq 0} |m|^{-3/2}\,|\hat{\psi}(\epsilon m)|.$$


But we can compare this last sum with the integral


$$\int_1^\infty r^{-3/2}\,\hat{\psi}(\epsilon r)\,r\,dr = O(\epsilon^{-1/2}).$$


Therefore we find
$$N(R) = \pi R^2 + O(\epsilon R) + O(R^{1/2}\epsilon^{-1/2}).$$


We choose $\epsilon = R^{-1/3}$ to make the error term $O(R^{2/3})$, as desired.
For relations of the lattice point problem to the eigenvalue problem see
Guillemin [Gut].
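
For readers who wish to experiment with the lattice point problem, the count $N(R)$ is easy to compute exactly for integer $R$. The Python sketch below does so with integer arithmetic and tabulates the ratio $|N(R) - \pi R^2| / R^{2/3}$. The function names and the range of radii are our own choices. A finite table of this kind says nothing about the asymptotic statement, and the only bound relied on in the code is the elementary one $|N(R) - \pi R^2| \leq \pi(\sqrt{2}\,R + 1/2)$, obtained by comparing the disc with the union of unit squares centered at the lattice points.

```python
import math

def lattice_count(radius):
    """Number of points of Z^2 in the closed disc of integer radius `radius`.
    For each integer x, the admissible y satisfy |y| <= isqrt(R^2 - x^2)."""
    r2 = radius * radius
    return sum(2 * math.isqrt(r2 - x * x) + 1
               for x in range(-radius, radius + 1))

def error_term(radius):
    """N(R) - pi R^2."""
    return lattice_count(radius) - math.pi * radius * radius

# Ratio |N(R) - pi R^2| / R^(2/3) for R = 1, ..., 400.
ratios = [abs(error_term(r)) / r ** (2.0 / 3.0) for r in range(1, 401)]

print(lattice_count(10), max(ratios))
```

The function `math.isqrt` requires Python 3.8 or later; it keeps the count exact, which a floating point square root would not guarantee for large radii.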


CHAPTER IX 


Integration and Measures on 
Locally Compact Spaces 


On a locally compact space, it is as natural to deal with continuous 
functions having compact support as it is natural to deal with step 
functions. Thus we must establish the relations which exist between 
functionals on the former or the latter. As we shall see, they essentially 
amount to the same thing. 

Thus the main point of this chapter is to see how one can associate a
measure to a functional on $C_c(X)$. Applications will be given in Chapter
X and in the spectral theorem of Chapter XX. The measure derived from
that situation is called a spectral measure.

Specializing to euclidean spaces, we relate integration with differentia- 
tion, using the infinitely differentiable functions and partial derivatives 
to define distributions, generalizing the notion of measure. There is no 
question here of going deeply into this theory, but only of showing 
readers how it arises naturally, and of making it easier for them to read 
standard treatises devoted to the subject. 

If the locally compact space is a locally compact group, then one can 
ask for the existence of an integral and a positive measure which are 
invariant under left translations. This is dealt with in Chapter XII. 

Both in this chapter and in Chapter XI, we prove the existence of
partitions of unity (in the locally compact and locally euclidean cases,
respectively). Strictly speaking, this is a tool belonging to general topology,
but we postponed dealing with it until it was needed. Such partitions
are used to glue together certain maps into a vector space, given
locally. They are thus used to reduce certain types of global questions to
local ones.
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Throughout this chapter we let X be a locally compact Hausdorff space. 


IX, §1. POSITIVE AND BOUNDED
FUNCTIONALS ON $C_c(X)$


We denote by $C_c(X)$ the vector space of continuous functions on $X$ with
compact support (i.e. vanishing outside a compact set). We write $C_c(X, \mathbf{R})$
or $C_c(X, \mathbf{C})$ if we wish to distinguish between the real or complex valued
functions.

We do not give formally a topology to $C_c(X)$, but observe that there
are two natural ones. Of course, we always have the sup norm, defined
on $C_c(X)$ since every function is bounded, vanishing outside a compact
set.

The other topology would come from considering the subspaces $C(K)$
for each compact subset $K$ of $X$, and observing that $C_c(X)$ is the union
of all $C(K)$ for all $K$. One can then give $C_c(X)$ a topology called the
inductive limit of the topologies coming from the sup norms on each
subspace $C(K)$. We do not go into this here, but we make additional
remarks at the end of §4.

We denote by $C_K(X)$ the subspace of $C_c(X)$ consisting of those functions
which vanish outside $K$. (Same notation $C_S(X)$ for those functions
which are 0 outside any subset $S$ of $X$. Most of the time, the useful
subsets in this context are the compact subsets $K$.)

A linear map $\lambda$ of $C_c(X)$ into the complex numbers (or into a normed
vector space, for that matter) is said to be bounded if there exists some
$C \geq 0$ such that we have


$$|\lambda f| \leq C\|f\|$$


for all $f \in C_c(X)$. Thus $\lambda$ is bounded if and only if $\lambda$ is continuous for the
norm topology.

A linear map $\lambda$ of $C_c(X)$ into the complex numbers is said to be
positive if we have $\lambda f \geq 0$ whenever $f$ is real and $\geq 0$.


Lemma 1.1. Let $\lambda\colon C_c(X) \to \mathbf{C}$ be a positive linear map. Then $\lambda$ is
bounded on $C_K(X)$ for any compact $K$.


Proof. By the corollary of Urysohn's lemma, there exists a continuous
real function $g \geq 0$ on $X$ which is 1 on $K$ and has compact support. If
$f \in C_K(X)$, let $b = \|f\|$. Say $f$ is real. Then $bg \pm f \geq 0$, whence

$$\lambda(bg) \pm \lambda f \geq 0$$


and $|\lambda f| \leq b\lambda(g)$. Thus $\lambda g$ is our desired bound.

A complex valued linear map on $C_c(X)$ which is bounded on each
subspace $C_K(X)$ for every compact $K$ will be called a $C_c$-functional on
$C_c(X)$. In accordance with a previous definition, a functional on $C_c(X)$
which is also continuous for the sup norm will be called a bounded
functional. It is clear that a bounded functional is also a $C_c$-functional.


Theorem 1.2. Let $\lambda$ be a bounded real functional on $C_c(X, \mathbf{R})$. Then $\lambda$
is expressible as the difference of two positive bounded functionals.


Proof. If $f \geq 0$ is in $C_c(X)$, define
$$\lambda^+ f = \sup \lambda g \qquad \text{for } 0 \leq g \leq f \text{ and } g \in C_c(X, \mathbf{R}).$$
Then $\lambda^+ f \geq 0$ and $\lambda^+ f \leq |\lambda|\,\|f\|$. Let $c \in \mathbf{R}$, $c > 0$. Then
$$\lambda^+(cf) = \sup \lambda(cg) \qquad \text{for } 0 \leq g \leq f,$$


whence $\lambda^+(cf) = c\lambda^+(f)$. Let $f_1, f_2$ be functions $\geq 0$ in $C_c(X, \mathbf{R})$. Then
taking $0 \leq g_1 \leq f_1$ and $0 \leq g_2 \leq f_2$, we have
$$\lambda^+ f_1 + \lambda^+ f_2 = \sup \lambda g_1 + \sup \lambda g_2 = \sup(\lambda g_1 + \lambda g_2) = \sup \lambda(g_1 + g_2) \leq \lambda^+(f_1 + f_2).$$


Conversely, let $0 \leq g \leq f_1 + f_2$. Then $0 \leq \inf(f_1, g) \leq f_1$ and


$$0 \leq g - \inf(f_1, g) \leq f_2.$$
Hence


$$\lambda g = \lambda(\inf(f_1, g)) + \lambda(g - \inf(f_1, g)) \leq \lambda^+ f_1 + \lambda^+ f_2.$$
Taking the sup on the left implies that $\lambda^+(f_1 + f_2) \leq \lambda^+ f_1 + \lambda^+ f_2$, thus
proving that $\lambda^+$ is additive.


We extend the definition of $\lambda^+$ to all elements of $C_c(X, \mathbf{R})$ by expressing
an arbitrary $f$ as a difference


$$f = f_1 - f_2,$$
where $f_1, f_2 \geq 0$, and letting


$$\lambda^+ f = \lambda^+ f_1 - \lambda^+ f_2.$$


The additivity of $\lambda^+$ on functions $\geq 0$ implies at once that this is well
defined, i.e. independent of the expression of $f$ as a difference of positive
functions. One then sees at once that this extension of $\lambda^+$ is linear. If
$f \geq 0$, then $\lambda^+ f \geq 0$ and also $\lambda^+ f \geq \lambda f$. We now define $\lambda^-$ by


$$\lambda^- f = \lambda^+ f - \lambda f.$$
Then $\lambda^-$ is linear, and


$$\lambda = \lambda^+ - \lambda^-.$$


Furthermore, both $\lambda^+$ and $\lambda^-$ are positive. Finally it is verified at once
that $\lambda^+$ and $\lambda^-$ are bounded, thus proving our theorem.


Note on Terminology. When dealing exclusively with Banach spaces, 
as was the case until this section, we used the word functional to apply 
to linear maps into the scalars, continuous with respect to the given 
norm. In dealing with $C_c(X)$, we shall usually say functional instead of
$C_c$-functional as defined above, and use an adjective (positive, bounded)
to describe any additional properties that such a linear map may have. 

A positive functional satisfies a strong continuity property with respect 
to increasing or decreasing sequences of continuous functions. 


Theorem 1.3 (Dini). Let $f \in C_c(X, \mathbf{R})$ be $\geq 0$, and let $\{f_n\}$ be a
sequence of positive functions in $C_c(X)$ which is increasing to $f$. Then
$\{f_n\}$ converges to $f$ uniformly. More generally, let $\Phi$ be a family of
positive functions in $C_c(X, \mathbf{R})$ which are $\leq f$, and such that


$$\sup_{\varphi \in \Phi} \varphi = f.$$


Assume that if $\varphi, \psi \in \Phi$, then $\sup(\varphi, \psi) \in \Phi$. Given $\epsilon$, there exists $\varphi \in \Phi$
such that $\|f - \varphi\| < \epsilon$. If $\lambda$ is a positive functional on $C_c(X)$, then


$$\lambda f = \sup_{\varphi \in \Phi} \lambda\varphi.$$


Proof. The assertion concerning the sequence is a special case of the
assertion concerning the family. We prove the latter. Let $f$ vanish outside
the compact set $K$. For each $x \in K$, we can find a function $\varphi_x \in \Phi$
such that


$$f(x) - \varphi_x(x) < \epsilon.$$


Then there is some open neighborhood $V_x$ of $x$ such that


$$f(y) - \varphi_x(y) < \epsilon \qquad \text{for all } y \in V_x.$$
If we cover $K$ by a finite number of such neighborhoods $V_{x_i}$ $(i = 1, \ldots, n)$,
and let


$$\varphi = \sup(\varphi_{x_1}, \ldots, \varphi_{x_n}),$$


then $\|f - \varphi\| < \epsilon$. This proves the first part of the theorem. The last
assertion follows by the continuity of $\lambda$ (Lemma 1.1).


IX, §2. POSITIVE FUNCTIONALS AS INTEGRALS 


The main result of the chapter is to interpret a functional on $C_c(X)$ as an
integral. Let $\mathcal{M}$ be the algebra of Borel sets in $X$. If $\mu$ is a positive
measure on $\mathcal{M}$ which is finite on compact sets, then $\mu$ gives rise to a
positive functional, denoted by $d\mu$, and given by


$$\langle f, d\mu \rangle = \int_X f\,d\mu.$$


We shall prove the converse (Riesz' theorem), and first obtain a positive
measure from a positive functional.

If $f$ is a function on $X$, we define its support to be the closure of the
set of all $x$ such that $f(x) \neq 0$. Thus the support is a closed set. We
denote it by $\operatorname{supp}(f)$.

We use the following notation as in Rudin [Ru 1], which we more or
less follow for the proof of Theorem 2.3. If $V$ is open, we write


$$f \prec V$$


to mean that $f$ is real, $f \in C_c(X)$, $0 \leq f \leq 1$, and $\operatorname{supp}(f) \subset V$. Similarly,
if $K$ is compact we write


$$K \prec f$$
to mean $\chi_K \leq f \leq 1$, and of course $f \in C_c(X)$.


Lemma 2.1. Given $K$ compact, $K \subset V$ open, there exists some $f$ such
that


$$K \prec f \prec V.$$


Proof. This is an immediate consequence of Urysohn's lemma. All we
have to do is choose some open set $W$ with compact closure $\overline{W}$ such
that $K \subset W \subset \overline{W} \subset V$, and use the normality of $\overline{W}$ to find a function $f$
with $0 \leq f \leq 1$ which is 1 on $K$ and 0 on the boundary of $W$. We then
extend this function to be 0 outside $W$. This extension is continuous on
all of $X$.




Lemma 2.2. If $V$ is open, then we have


$$\chi_V = \sup f \qquad \text{for } f \prec V.$$


Proof. Given $x \in V$, there exists an open neighborhood $W$ of $x$ such
that $x \in W \subset \overline{W} \subset V$, and such that $\overline{W}$ is compact. We can find a
function $f$ with $0 \leq f \leq 1$ such that $f(x) = 1$ and $f$ is 0 on the complement
of $W$, by the corollary of Urysohn's lemma. This proves our assertion.


Theorem 2.3 (Riesz Theorem, Part 1). Let $\lambda$ be a positive functional on
$C_c(X)$. There exists a unique positive Borel measure $\mu$ satisfying conditions
(i) and (ii) below, and this measure also satisfies (iii) and (iv).

(i) If $V$ is open, then

$$\mu(V) = \sup \lambda g \quad \text{for } g \prec V.$$

(ii) If $A$ is a Borel set, then

$$\mu(A) = \inf \mu(V) \quad \text{for } V \text{ open} \supset A.$$

(iii) If $K$ is compact, then $\mu(K)$ is finite.

(iv) If $A$ is a Borel set and $A$ is $\sigma$-finite, or $A$ is open, then

$$\mu(A) = \sup \mu(K) \quad \text{for } K \text{ compact} \subset A.$$


Remark 1. From (ii) and the remarks before Theorem 2.3, we see at
once that for any compact $K$ we have

$$\mu(K) = \inf \lambda f \quad \text{for } K \prec f.$$

Remark 2. The uniqueness of $\mu$ satisfying (i) and (ii) is obvious, because
(i) determines $\mu$ on open sets, and (ii) determines $\mu$ on all Borel
sets.
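
As an illustration of condition (i), the following minimal sketch takes $X = \mathbf{R}$ and the functional $\lambda f = f(0) + \int f\,dx$ (a unit point mass at 0 plus Lebesgue measure). This functional, the interval $V = (-1, 2)$, and the helper names `trapezoid` and `lam` are assumptions chosen for the example and do not come from the text. The code evaluates $\lambda$ on trapezoid functions $g \prec V$ whose supports exhaust $V$. Since every $g \prec V$ satisfies $\lambda g \leq 1 + 3 = 4$, the values computed here increase toward the expected $\mu(V) = 4$.

```python
def trapezoid(x, lo, hi, ramp):
    """Equal to 1 on [lo + ramp, hi - ramp], 0 outside (lo, hi), linear in between."""
    if x <= lo or x >= hi:
        return 0.0
    return min(1.0, (x - lo) / ramp, (hi - x) / ramp)


def lam(a, b, delta):
    """lambda(g) for the trapezoid g with support [a + delta, b - delta] and ramp width delta.

    Here lambda f = f(0) + integral of f dx (an assumed example functional).
    The integral of the trapezoid is the plateau length (b - a - 4*delta)
    plus two triangular ramps of total area delta.
    """
    g_at_zero = trapezoid(0.0, a + delta, b - delta, delta)
    return g_at_zero + (b - a - 3.0 * delta)


a, b = -1.0, 2.0                      # V = (-1, 2), which contains the point 0
values = [lam(a, b, 10.0 ** -k) for k in range(1, 7)]
for k, v in enumerate(values, start=1):
    print(f"delta = 1e-{k}: lambda(g) = {v:.6f}")
```

For an interval that does not contain 0, such as $(1, 2)$, the point mass contributes nothing and the same values increase toward the length of the interval.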


Remark 3. It is convenient to introduce a word to summarize the
main properties listed in Theorem 2.3. We shall say that a positive
measure on a locally compact space, defined on a $\sigma$-algebra $\mathcal{M}$ containing
the Borel sets, is $\sigma$-regular if it satisfies properties (ii), (iii), and (iv)
of Theorem 2.3 for the sets of $\mathcal{M}$. Even though $X$ itself need not be
$\sigma$-finite, in applications only $\sigma$-finite sets arise because for instance any
function in $\mathcal{L}^1$ is equivalent to a function vanishing outside a $\sigma$-finite set.

Let $\mathcal{M}$ be a $\sigma$-algebra containing the Borel sets. A positive measure $\mu$
on $\mathcal{M}$ is said to be regular if $\mu(K)$ is finite for all compact $K$, and if in
addition, for all $A \in \mathcal{M}$ we have

$$\mu(A) = \inf \mu(V) \quad \text{for } V \text{ open} \supset A,$$
$$\mu(A) = \sup \mu(K) \quad \text{for } K \text{ compact} \subset A.$$


Cases when a measure is not regular are to be regarded as pathological. 
Our definitions are adjusted so that the following statement is merely a 
rephrasing of parts of our definitions: 


If $X$ is $\sigma$-finite and $\mu$ is $\sigma$-regular, then $\mu$ is regular.

Note that the property that $\mu(V) = \sup \mu(K)$ for compact $K \subset V$ is satisfied
by open $V$. This is convenient because even in pathological situations,
we are able to define the measure of Theorem 2.3 on Borel sets
rather than on a more restricted algebra (e.g. that generated by the
compact sets, as is sometimes done in the literature). Observe that if we
know that property (iv) is satisfied by all sets $A$ of finite measure, then it
follows at once for any $\sigma$-finite $A$. Indeed, let $\{A_n\}$ be a disjoint sequence
of sets of finite measure, and let $K_n$ be a compact subset of $A_n$ such that
$\mu(A_n - K_n) < \varepsilon/2^n$. Then $K_1 \cup \cdots \cup K_n$ is compact, and
$\mu(K_1 \cup \cdots \cup K_n)$ tends to the measure of $\bigcup A_n$ as $n \to \infty$, whether
this measure is finite or not.


For an example of pathology, let $I$ be a non-countable set of indices,
and let $\{X_i\}$ ($i \in I$) be disjoint copies of the interval $[0, 1]$. Let $x_i$ be a
point in $X_i$, and let $S$ be the union of all $x_i$. Then $S$ is discrete, and has
infinite measure, but all compact subsets of $S$ are finite and have measure
0.


We now come to the proof of Theorem 2.3. 


We shall actually define $\mu$ on a larger algebra than that of the Borel
sets, more or less the largest algebra such that the measure still satisfies
our four conditions. For instance, it is clear that the complete measure
obtained from a measure satisfying our properties still satisfies these
properties.

Until the end of the proof of Theorem 2.6 we let $f$, $g$, $h$ denote
elements of $C_c(X)$ which are real $\geq 0$.


Lemma 2.4. Let $\lambda$ be a positive functional on $C_c(X)$. For each open set
$V$, define

$$\mu(V) = \sup \lambda g \quad \text{for } g \prec V.$$

For any subset $Y$ of $X$ define

$$\mu(Y) = \inf \mu(V) \quad \text{for } V \text{ open}, \ Y \subset V.$$

Then $\mu$ is an outer measure on the algebra of all subsets of $X$.


Proof. It is clear that $\mu(\emptyset) = 0$ and that if $A \subset B$, then $\mu(A) \leq \mu(B)$.
For convenience of notation, we write

$$\sup(f, g) = f \cup g \quad \text{and} \quad \inf(f, g) = f \cap g.$$

We prove that if $V_1$, $V_2$ are open, then

$$(1) \qquad \mu(V_1 \cup V_2) \leq \mu(V_1) + \mu(V_2).$$

Let $h \prec V_1 \cup V_2$. Let $\Phi$ be the family of all functions $\sup(g_1, g_2)$ with
$g_i \prec V_i$ for $i = 1, 2$. Then $\Phi$ is closed under the sup operation (on a finite
number of elements), and we have

$$\sup_{g \in \Phi} g = \chi_{V_1 \cup V_2}.$$

Let $\Phi_h$ be the family of all functions

$$h \cap (g_1 \cup g_2), \qquad g_i \prec V_i, \quad i = 1, 2.$$

Then $h$ is the sup of all functions in $\Phi_h$, whence by Theorem 1.3

$$\lambda h = \sup_{g_1, g_2} \lambda\bigl(h \cap (g_1 \cup g_2)\bigr)
\leq \sup_{g_1, g_2} \lambda(h \cap g_1 + h \cap g_2)
\leq \mu(V_1) + \mu(V_2).$$

Taking the sup over all $h$ on the left yields our inequality (1).
Now let $\{A_n\}$ be a sequence of subsets of $X$, and $A = \bigcup A_n$. Let $V_n$
be open, $A_n \subset V_n$, and

$$\mu(V_n) \leq \mu(A_n) + \frac{\varepsilon}{2^n}.$$

Let $V = \bigcup V_n$. Then $\bigcup A_n \subset \bigcup V_n = V$. Let $g \prec V$. Since $g$ has compact
support, there is some $n$ such that

$$g \prec V_1 \cup \cdots \cup V_n,$$

whence by (1) and induction,

$$\lambda g \leq \mu(V_1) + \cdots + \mu(V_n).$$

Taking the sup over all $g$ on the left yields

$$\mu(A) \leq \mu(V) \leq \sum \mu(A_n) + \varepsilon.$$

This proves our lemma.

Remark. The special role played by compact and open sets in con- 
structing the algebra of measurable sets and the measure on it will stem 
from the following property: 


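If $K \prec f \prec V$, then $\mu(K) \leq \lambda f \leq \mu(V)$.

The right inequality is obvious. As for the left one, let $W$ be the set of $x$
such that $f(x) > 1 - \varepsilon$. Then $K \subset W$. Let $g \prec W$ be such that

$$\mu(W) \leq \lambda g + \varepsilon.$$

We have

$$(1 - \varepsilon) g \leq f \quad \text{whence} \quad (1 - \varepsilon)\lambda g \leq \lambda f.$$

Hence

$$\mu(K) \leq \mu(W) \leq \lambda g + \varepsilon \leq \frac{\lambda f}{1 - \varepsilon} + \varepsilon.$$

This implies that $\mu(K) \leq \lambda f$, as contended.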


Our remark shows the main idea of what follows. We recover characteristic
functions of certain sets by squeezing them between compact and
open sets, and comparing them with functions $f \in C_c(X)$ on which the
given functional is defined. The $K$ and $V$ allow us to use the old
technique of lower and upper sums respectively. We first have to recover
the measure itself, however, and we proceed to do this. For convenience
of notation, the outer measure $\mu$ described in Lemma 2.4 will be called
the outer measure determined by $\lambda$.


Lemma 2.5. Let $\mathcal{A}$ be the collection of all subsets $A$ of $X$ such that
$\mu(A) < \infty$, and

$$\mu(A) = \sup \mu(K) \quad \text{for } K \text{ compact} \subset A.$$

Then $\mathcal{A}$ is an algebra containing all compact sets and all open sets of
finite measure. Furthermore, $\mu$ is a positive measure on $\mathcal{A}$. In fact, if
$\{A_n\}$ is a disjoint sequence of elements of $\mathcal{A}$, and $A = \bigcup A_n$, then

$$\mu(A) = \sum \mu(A_n).$$

If in addition $\mu(A) < \infty$, then $A \in \mathcal{A}$.


Proof. If $K$ is compact, then there exists an open $V$ containing $K$
such that $\overline{V}$ is compact. Let $\overline{V} \prec g$. For any $f \prec V$ we have $f \leq g$,
whence $\lambda f \leq \lambda g$ and $\mu(V) \leq \lambda g$, so that $\mu(K) \leq \lambda g$ is finite. It is then
clear that $K \in \mathcal{A}$.

Let $V$ be open. We shall prove that

$$\mu(V) = \sup \mu(K) \quad \text{for } K \text{ compact}, \ K \subset V.$$

We may assume that $\mu(V) > 0$. To cover the case when $\mu(V) = \infty$, we let
$r$ be a real number such that $0 < r < \mu(V)$. There exists $f \prec V$ such that

$$r < \lambda f \leq \mu(V).$$

Let $K$ be the support of $f$. If $W$ is an open set containing $K$, then
$f \prec W$, whence

$$r < \lambda f \leq \mu(W),$$

and therefore $r < \mu(K)$. This proves that $\mu(V) = \sup \mu(K)$ for $K$ compact
$\subset V$. In particular, if $\mu(V)$ is finite, then $V \in \mathcal{A}$.

Before proving that $\mathcal{A}$ is an algebra, we find it convenient to have the
finite additivity. Actually, it is no more troublesome to prove the countable
additivity. First we prove that if $K_1$, $K_2$ are disjoint and compact,
then

$$\mu(K_1 \cup K_2) = \mu(K_1) + \mu(K_2).$$

Let $V_1$, $V_2$ be disjoint open sets containing $K_1$, $K_2$ respectively. Let $W$
be open containing $K_1 \cup K_2$ such that

$$\mu(W) \leq \mu(K_1 \cup K_2) + \varepsilon.$$

Let $g_i \prec W \cap V_i$ be such that for $i = 1, 2$ we have

$$\mu(W \cap V_i) \leq \lambda g_i + \varepsilon.$$

Then

$$\begin{aligned}
\mu(K_1) + \mu(K_2) &\leq \mu(W \cap V_1) + \mu(W \cap V_2) \\
&\leq \lambda g_1 + \lambda g_2 + 2\varepsilon \\
&= \lambda(g_1 + g_2) + 2\varepsilon \\
&\leq \mu(W) + 2\varepsilon \leq \mu(K_1 \cup K_2) + 3\varepsilon.
\end{aligned}$$

The reverse inequality is true because $\mu$ is an outer measure, so we get
the desired equality.

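Given a sequence of disjoint sets $\{A_n\}$ in $\mathcal{A}$, let $K_n \subset A_n$ be compact,
such that

$$\mu(A_n) \leq \mu(K_n) + \frac{\varepsilon}{2^n}.$$

Let $A = \bigcup A_n$. Then for all $n$,

$$\sum_{i=1}^{n} \mu(A_i) \leq \sum_{i=1}^{n} \mu(K_i) + \varepsilon
= \mu(K_1 \cup \cdots \cup K_n) + \varepsilon \leq \mu(A) + \varepsilon.$$

Letting $n$ tend to infinity, and then $\varepsilon$ tend to 0, together with the fact
that $\mu$ is an outer measure, shows that

$$\sum_{n=1}^{\infty} \mu(A_n) = \mu(A).$$

This gives the countable additivity and also proves that $A \in \mathcal{A}$ if $\mu(A)$ is
finite.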

We can now prove that $\mathcal{A}$ is an algebra. Clearly the empty set is in
$\mathcal{A}$. If $A_1, A_2 \in \mathcal{A}$, we can find compact sets $K_1$, $K_2$ and open sets $V_1$,
$V_2$ such that

$$K_i \subset A_i \subset V_i \qquad (i = 1, 2)$$

and such that for $i = 1, 2$ we have

$$\mu(K_i) \leq \mu(A_i) \leq \mu(V_i) < \mu(K_i) + \varepsilon.$$

In particular, by the finite additivity of $\mu$, we have

$$\mu(V_i - K_i) < \varepsilon.$$

Since

$$(V_1 \cup V_2) - (K_1 \cup K_2) \subset (V_1 - K_1) \cup (V_2 - K_2),$$

we get

$$\mu(A_1 \cup A_2) < \mu(K_1 \cup K_2) + 2\varepsilon.$$

It follows that $A_1 \cup A_2$ lies in $\mathcal{A}$. Next we note that $K_1 - V_2$ is compact,
$V_1 - K_2$ is open, and

$$(K_1 - V_2) \subset (A_1 - A_2) \subset (V_1 - K_2).$$

The difference of the two extreme sets satisfies

$$(V_1 - K_2) - (K_1 - V_2) \subset (V_1 - K_1) \cup (V_2 - K_2)$$

so that $A_1 - A_2$ lies in $\mathcal{A}$. Since we can write

$$A_1 \cap A_2 = A_1 - (A_1 - A_2)$$

it follows that $A_1 \cap A_2$ lies in $\mathcal{A}$, thus showing that $\mathcal{A}$ is an algebra, and
proving our lemma.


Theorem 2.6. Let $\lambda$ be a positive functional on $C_c(X)$. Let $\mu$ be the
outer measure determined by $\lambda$, and let $\mathcal{A}$ be the algebra of all sets $A$
of finite measure such that

$$\mu(A) = \sup \mu(K) \quad \text{for } K \text{ compact} \subset A.$$

Let $\mathcal{M}$ be the collection of all subsets $Y$ of $X$ such that $Y \cap K$ lies in
$\mathcal{A}$ for all compact $K$. Then $\mathcal{M}$ is a $\sigma$-algebra containing the Borel sets,
and $\mu$ is a positive measure on $\mathcal{M}$. Furthermore, $\mathcal{A}$ consists of the sets
of finite measure in $\mathcal{M}$.


Proof. It is clear that $\mathcal{A} \subset \mathcal{M}$. Let $\mathcal{M}_K$ as usual denote the collection
of all sets $Y \cap K$ with $Y \in \mathcal{M}$. Then $\mathcal{M}_K = \mathcal{A}_K$, and is therefore a
$\sigma$-algebra in $K$ for each compact $K$, by Lemma 2.5. It follows immediately
that $\mathcal{M}$ itself is a $\sigma$-algebra, because the operations of countable union,
intersection, and complementation in $X$ commute with the operation of
intersecting with $K$. (Cf. Lemma 6.2 of Chapter VI where we met a
similar situation.)

That $\mathcal{M}$ contains all closed sets is obvious because if $Y$ is closed and
$K$ compact, then $Y \cap K$ is compact and so lies in $\mathcal{A}$. Therefore $\mathcal{M}$
contains the Borel sets.

Let $A$ be of finite measure in $\mathcal{M}$. Let $V$ be open containing $A$, and of
finite measure. Let $K$ be compact $\subset V$ such that

$$\mu(V) < \mu(K) + \varepsilon.$$

Since $A \cap K$ lies in $\mathcal{A}$, there is some compact $K' \subset A \cap K$ such that

$$\mu(A \cap K) < \mu(K') + \varepsilon.$$

But $A \subset (A \cap K) \cup (V - K)$, so that

$$\mu(A) \leq \mu(A \cap K) + \mu(V - K) \leq \mu(K') + 2\varepsilon.$$

This proves that $A$ lies in $\mathcal{A}$, and therefore that $\mathcal{A}$ is precisely the
algebra of sets of finite measure in $\mathcal{M}$.

Finally, let $\{A_n\}$ be a disjoint sequence in $\mathcal{M}$. If some $A_n$ has infinite
measure, the countable additivity of $\mu$ on $\bigcup A_n$ is clear. If all $A_n$ have
finite measure, then Lemma 2.5 applies. This proves our theorem.


The measure of Theorem 2.3 (or Theorem 2.6) will be called the
associated measure of $\lambda$, or the measure determined by $\lambda$. In applications,
one needs it mainly on the Borel sets (or the completion of the Borel
sets).

We now wish to prove that the functional $\lambda$ is given by the integral.
First we note that if $\mu$ is a $\sigma$-regular measure and $f \in C_c(X)$, then $f$ is in
$\mathcal{L}^1(\mu)$. Indeed, $f$ being continuous implies that $f$ is measurable. Also $f$
vanishes outside a compact set (so of finite measure), and is bounded on
that set, and hence $f$ is in $\mathcal{L}^1(\mu)$, say by Corollary 5.9 of the dominated
convergence theorem (Chapter VI). Next we need a lemma.


Partitions of Unity. Let $K$ be compact and let $\{U_1, \ldots, U_n\}$ be an open
covering of $K$. There exist functions $f_i$ ($i = 1, \ldots, n$) such that $f_i \prec U_i$
and such that

$$\sum_{i=1}^{n} f_i(x) = 1, \quad \text{all } x \in K.$$

Proof. For each $x \in K$ let $W_x$ be an open neighborhood of $x$ such that
$\overline{W}_x \subset U_{i(x)}$ for some index $i(x)$. We can cover $K$ by a finite number of
open sets $W_{x_1}, \ldots, W_{x_m}$. Let $V_i$ be the union of all open sets $W_{x_j}$ such
that $\overline{W}_{x_j} \subset U_i$. Then $\{V_1, \ldots, V_n\}$ is an open covering of $K$. Furthermore
$\overline{V}_i \subset U_i$. Let $g_i$ be a function such that

$$\overline{V}_i \prec g_i \prec U_i.$$

Let

$$\begin{aligned}
f_1 &= g_1, \\
f_2 &= g_2(1 - g_1), \\
&\ \ \vdots \\
f_n &= g_n(1 - g_1) \cdots (1 - g_{n-1}).
\end{aligned}$$

Then $f_i \prec U_i$, and by induction one sees at once that

$$f_1 + \cdots + f_n = 1 - (1 - g_1) \cdots (1 - g_n).$$

From this our condition $\sum f_i(x) = 1$ for $x \in K$ follows at once.

The functions $\{f_i\}$ are said to form a partition of unity over $K$, subordinate
to the covering $\{U_1, \ldots, U_n\}$.
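
The construction in the proof can be checked numerically. The sketch below is only an illustration under assumed data: $K = [0, 3]$ is covered by $U_1 = (-1, 2)$ and $U_2 = (1, 4)$, and the functions $g_i$ are hypothetical trapezoids built by the helper `bump`, which is not part of the text. It forms $f_i = g_i(1 - g_1)\cdots(1 - g_{i-1})$ and compares the sum with $1 - (1 - g_1)\cdots(1 - g_n)$ at sample points of $K$.

```python
from math import prod


def bump(x, a, b, margin):
    """Continuous trapezoid: 1 on [a, b], 0 outside (a - margin, b + margin)."""
    if a <= x <= b:
        return 1.0
    if x < a:
        return max(0.0, 1.0 - (a - x) / margin)
    return max(0.0, 1.0 - (x - b) / margin)


def partition_of_unity(gs):
    """Given g_1, ..., g_n, return f_i = g_i * (1 - g_1) * ... * (1 - g_{i-1})."""
    def make(i):
        return lambda x: gs[i](x) * prod(1.0 - gs[j](x) for j in range(i))
    return [make(i) for i in range(len(gs))]


# Assumed data: g_1 is 1 on [-0.5, 1.5] with support [-0.9, 1.9] inside U_1,
# and g_2 is 1 on [1.5, 3.5] with support [1.1, 3.9] inside U_2.
gs = [lambda x: bump(x, -0.5, 1.5, 0.4), lambda x: bump(x, 1.5, 3.5, 0.4)]
fs = partition_of_unity(gs)

grid = [i / 100 for i in range(0, 301)]          # sample points of K = [0, 3]
sums = [sum(f(x) for f in fs) for x in grid]
identity_gap = max(
    abs(s - (1.0 - prod(1.0 - g(x) for g in gs))) for s, x in zip(sums, grid)
)
print(min(sums), max(sums), identity_gap)
```

The ordering of the $g_i$ matters for the individual $f_i$ but not for their sum, which depends only on the product $(1 - g_1)\cdots(1 - g_n)$.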


Theorem 2.7 (Riesz Theorem, Part 2). Let $\lambda$ be a positive functional
on $C_c(X)$, and let $\mu$ be the Borel measure determined by $\lambda$. For all
$f \in C_c(X)$ we have

$$\lambda f = \int_X f \, d\mu.$$

Proof. It suffices to prove our statement when $f$ is real, $\|f\| \neq 0$. It
will also suffice to prove the inequality

$$\lambda f \leq \int_X f \, d\mu$$

(the reverse inequality following by considering $-f$ instead of $f$). Let $K$
be the support of $f$. Given $\varepsilon$, which we may assume $< \|f\|$, we can find
a partition $\{A_1, \ldots, A_n\}$ of $K$ by measurable sets, a step function

$$\varphi = \sum_{i=1}^{n} c_i \chi_{A_i}$$

such that $f \leq \varphi \leq f + \varepsilon$, and open sets $V_i \supset A_i$ such that

$$(*) \qquad \mu(V_i) \leq \mu(A_i) + \frac{\varepsilon}{n\|f\|}$$

and also that $f < c_i$ on $V_i$. [For instance, cut an interval containing the
image of $f$ into $\varepsilon/2$-subintervals, say half closed to make them disjoint,
and let $c_i'$ be the right end point of each subinterval. Let $A_i$ be the
inverse image in $K$ of the $i$-th subinterval. Let $c_i = c_i' + \varepsilon/2$. For each $i$
let $W_i$ be open $\supset A_i$ such that $f < c_i$ on $W_i$, and shrink $W_i$ to an open
$V_i \supset A_i$ satisfying (*).]

Let $\{h_1, \ldots, h_n\}$ be a partition of unity over $K$ subordinate to
$\{V_1, \ldots, V_n\}$. Then $f h_i$ has support in $V_i$, and $f h_i \leq c_i h_i$. Furthermore,
$K \prec \inf(1, \sum h_i)$, so that

$$\mu(K) \leq \lambda\Bigl(\sum h_i\Bigr) = \sum \lambda h_i.$$

Let $c = \max |c_i|$. Then $c \leq \|f\| + \varepsilon$. We have

$$\begin{aligned}
\lambda f = \sum \lambda(f h_i) &\leq \sum \lambda(c_i h_i)
= \sum c_i \lambda h_i = \sum (c_i + c)\lambda h_i - c \sum \lambda h_i \\
&\leq \sum (c_i + c)\mu(V_i) - c\mu(K) \\
&\leq \sum (c_i + c)\Bigl[\mu(A_i) + \frac{\varepsilon}{n\|f\|}\Bigr] - c\mu(K) \\
&\leq \int_K \varphi \, d\mu + 4\varepsilon + c\mu(K) - c\mu(K) \\
&\leq \int_K f \, d\mu + \varepsilon\mu(K) + 4\varepsilon.
\end{aligned}$$

This proves our inequality since the integral of $f$ over $K$ is the same as
the integral of $f$ over $X$, and concludes the proof of our theorem.
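
The bracketed recipe for the step function in the proof can be tried on sample values. The following sketch is an illustration only: the function $f(x) = x \sin x$ on $K = [0, \pi]$, the value of $\varepsilon$, the lower end point `base`, and the helper name `step_value` are all assumptions made for the example. It assigns to each value of $f$ the number $c_i = c_i' + \varepsilon/2$, where $c_i'$ is the right end point of the half-closed $\varepsilon/2$-subinterval containing that value, and records the gap between the resulting step function and $f$.

```python
import math


def step_value(v, eps, base):
    """Return c_i = c_i' + eps/2 for the subinterval containing v.

    The subintervals are (base + (i - 1) * eps/2, base + i * eps/2],
    and c_i' = base + i * eps/2 is the right end point.
    """
    half = eps / 2.0
    i = math.ceil((v - base) / half)
    right_end = base + i * half
    return right_end + half


def f(x):
    return x * math.sin(x)


eps = 0.25
base = -1.0          # assumed lower end of an interval containing the image of f
xs = [i * math.pi / 500 for i in range(501)]      # sample points of K = [0, pi]
gaps = [step_value(f(x), eps, base) - f(x) for x in xs]
print(min(gaps), max(gaps))
```

Up to rounding, every gap lies between $\varepsilon/2$ and $\varepsilon$, which is the sampled form of $f \leq \varphi \leq f + \varepsilon$ together with the strict inequality $f < c_i$ on $A_i$.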


Corollary 2.8. Let $M_\sigma$ be the set of $\sigma$-regular positive Borel measures
on $X$. The map

$$\mu \mapsto d\mu$$

is an additive bijection between $M_\sigma$ and the set of positive functionals on
$C_c(X)$.

Proof. Theorems 2.3 and 2.7 show that the map $\mu \mapsto d\mu$ is surjective.
Let $\mu_1$, $\mu_2$ be positive measures satisfying conditions (ii), (iii), and (iv) of
Theorem 2.3 and assume that $d\mu_1 = d\mu_2$. To show that $\mu_1 = \mu_2$ it
suffices to prove that the two measures coincide on compact sets, because
then (iv) shows that they coincide on open sets, and (ii) shows that they
are equal on Borel sets. Let $K$ be compact, and let $V$ be an open set
containing $K$ such that

$$\mu_2(V) < \mu_2(K) + \varepsilon.$$

Let $K \prec f \prec V$. Then $\chi_K \leq f \leq \chi_V$, whence

$$\mu_1(K) \leq \int f \, d\mu_1 = \int f \, d\mu_2 \leq \mu_2(V) \leq \mu_2(K) + \varepsilon.$$

This proves one inequality, and the other follows by symmetry. Thus we
get a bijection between $M_\sigma$ and the set of positive functionals on $C_c(X)$.
This bijection is obviously additive. This proves our corollary.
Theorem 3.1. Let $\mu$ be a positive $\sigma$-regular Borel measure on $X$. Then
$C_c(X)$ is dense in $L^p(\mu)$ for $1 \leq p < \infty$.

Proof. The step functions are dense in $L^p(\mu)$, and it thus suffices to
prove that for any set $A$ of finite measure, given $\varepsilon$ we can find some
$f \in C_c(X)$ such that

$$\|\chi_A - f\|_p < \varepsilon.$$

We take $K$ compact, $V$ open such that $K \subset A \subset V$, and $\mu(V) < \mu(K) + \varepsilon$.
Let $K \prec f \prec V$. Then

$$\mu(K) \leq \int f \, d\mu \leq \mu(V)$$

and

$$\|\chi_A - f\|_p \leq \|\chi_A - \chi_K\|_p + \|f - \chi_K\|_p < \varepsilon^{1/p} + \varepsilon^{1/p},$$

thus proving our assertion.
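
For a concrete instance of this estimate, take Lebesgue measure on $\mathbf{R}$, $A = K = [0, 1]$, $V = (-2\delta, 1 + 2\delta)$, and let $f$ be the trapezoid which is 1 on $K$ and falls linearly to 0 at distance $\delta$ from $K$. These choices, and the helper name `lp_distance`, are assumptions made for illustration. A direct computation gives $\|\chi_A - f\|_p = (2\delta/(p+1))^{1/p}$, and the sketch below compares a midpoint-rule estimate with this closed form and with the bound $(\mu(V) - \mu(K))^{1/p} = (4\delta)^{1/p}$.

```python
def lp_distance(delta, p, n=20000):
    """Midpoint-rule estimate of ||f - chi_[0,1]||_p for the trapezoid f with ramp width delta."""
    step = delta / n
    # On each ramp, f = 1 - t/delta at distance t from [0, 1], and chi_[0,1] is 0 there.
    ramp = sum((1.0 - (i + 0.5) * step / delta) ** p for i in range(n)) * step
    return (2.0 * ramp) ** (1.0 / p)


p = 2
deltas = (0.1, 0.01, 0.001)
dists = [lp_distance(d, p) for d in deltas]
exact = [(2.0 * d / (p + 1)) ** (1.0 / p) for d in deltas]
bounds = [(4.0 * d) ** (1.0 / p) for d in deltas]
for d, est, ex, bd in zip(deltas, dists, exact, bounds):
    print(f"delta = {d}: estimate {est:.6f}, closed form {ex:.6f}, bound {bd:.6f}")
```

The distance tends to 0 with $\delta$ for every finite $p$, but not for the sup norm, which is why the theorem is restricted to $1 \leq p < \infty$.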


Corollary 3.2. If $f \in \mathcal{L}^1(\mu)$ and $\int f\varphi \, d\mu = 0$ for all $\varphi \in C_c(X)$, then
$f = 0$ almost everywhere.

Proof. Let $A$ be a set of finite measure. Then $\chi_A$ is the $L^1$-limit of a
sequence $\{\varphi_n\}$ in $C_c(X)$ with $0 \leq \varphi_n \leq 1$. Taking a subsequence if necessary,
we may assume, by Theorem 5.2 of Chapter VI, that $\{\varphi_n\}$ converges
to $\chi_A$ almost everywhere, and thus $\{f\varphi_n\}$ converges to $f\chi_A$ almost
everywhere. By the dominated convergence theorem, we conclude that
$\int f\chi_A \, d\mu = 0$, and Corollary 5.16 of Chapter VI finishes the proof.


The next theorem shows that a measurable function is almost continu- 
ous, on a set of finite measure. 


Theorem 3.3 (Lusin's Theorem). Let $\mu$ be a positive $\sigma$-regular Borel
measure on $X$. Let $f$ be a complex measurable function on $X$, and
assume that there exists a set $A$ of finite measure such that $f$ is equal
to 0 outside $A$. Given $\varepsilon$, there exists $g \in C_c(X)$ and a measurable set $Z$
with $\mu(Z) < \varepsilon$ such that $f(x) = g(x)$ for $x \in X - Z$. Furthermore, we can
select $g$ such that $\|g\| \leq \|f\|$ (sup norm).

Proof. Let $A_n$ be the set where $|f| \geq n$. Since the intersection of all $A_n$
is empty, it follows that the measures $\mu(A_n)$ approach 0. Excluding a set
of small measure, we suppose that $f$ is bounded.

In this case, $f$ is in $\mathcal{L}^1(\mu, \mathbf{C})$. By Theorem 3.1 there exists a sequence
$\{g_n\}$ in $C_c(X)$ which is $L^1$-convergent to $f$. Taking a subsequence if
necessary, and using Theorem 5.2 of Chapter VI, we may assume that
there is a set $Z$ with $\mu(Z) < \varepsilon$ such that the convergence is uniform
outside $Z$. By regularity, we can find a compact set $K$ contained in
$A - Z$ such that $\mu(A - K) < 2\varepsilon$. The convergence of $\{g_n\}$ is uniform on
$K$, and hence the restriction of $f$ to $K$ is a continuous function $g$ on $K$.
Let $V$ be open $\supset K$ such that $\overline{V}$ is compact. By Theorem 4.4 of Chapter
II (Tietze extension theorem) we can find a continuous function $g^*$ which
is equal to $g$ on $K$ and 0 on the boundary of $V$. We extend $g^*$ to all of
$X$ by giving $g^*$ the value 0 outside $V$. Then $g^*$ is equal to $f$ on $K$ and is
in $C_c(X)$.

This leaves only the last statement, that we can manage $\|g\| \leq \|f\|$.
Let $b = \|f\|$. Let $h$ be the function such that $h(z) = z$ if $|z| \leq b$ and
$h(z) = bz/|z|$ if $|z| > b$. Then $h$ is continuous, $\|h\| \leq b$, and $h \circ g^*$ fulfills
our requirements, thus proving Lusin's theorem.
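
The truncation $h$ used in the last step is the radial projection of $\mathbf{C}$ onto the closed disc of radius $b$. A minimal sketch follows; the name `radial_clamp` and the sample points are assumptions made for illustration.

```python
def radial_clamp(z: complex, b: float) -> complex:
    """h(z) = z if |z| <= b, and b * z / |z| otherwise."""
    r = abs(z)
    return z if r <= b else b * z / r


b = 2.0
samples = [0j, 1 + 1j, 2 + 0j, 3 - 4j, -10j]
clamped = [radial_clamp(z, b) for z in samples]
for z, w in zip(samples, clamped):
    print(z, "->", w, "modulus", abs(w))
```

Points of modulus at most $b$ are left fixed, which is why $h \circ g^*$ still agrees with $f$ wherever $g^*$ does, and every value of $h$ has modulus at most $b$.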


IX, §4. BOUNDED FUNCTIONALS AS INTEGRALS 


Let m be a complex valued measure on the Borel sets of X. We shall say 
that m is regular if |m| is regular. See Exercise 7 for examples. Recall 
that for complex measures, |m| is always bounded, by Theorem 3.3 of 
Chapter VII, and that we can define the norm ||m|| = |m|(X). 


Theorem 4.1. The complex regular Borel measures on $X$ form a Banach
space.

Proof. We leave most of the proof as an exercise. We shall just prove
that if $m_1$, $m_2$ are regular, then $m_1 + m_2$ is regular. Indeed, we have

$$|m_1 + m_2| \leq |m_1| + |m_2|.$$

For any Borel set $A$ we select $K$ compact in $A$ such that

$$|m_1|(A - K) < \varepsilon \quad \text{and} \quad |m_2|(A - K) < \varepsilon.$$

Then $|m_1 + m_2|(A - K) < 2\varepsilon$. Similarly for open sets, whence $m_1 + m_2$ is
regular.


We wish to interpret regular Borel measures as bounded functionals
on $C_c(X)$. The easiest way at this point is to use the Radon-Nikodym
theorem, and write

$$dm = h \, d|m|$$

for some $h \in \mathcal{L}^1(|m|, \mathbf{C})$, with $|h| = 1$ (Theorem 3.5 of Chapter VII). Thus
by definition, for $f \in C_c(X)$, we define

$$\langle f, dm \rangle = \int_X f h \, d|m|.$$

Let us denote by $M_r(X, \mathbf{C}) = M_r$ the Banach space of complex regular
Borel measures on $X$. The map

$$m \mapsto dm$$

is then a linear map of $M_r$ into the dual space of $C_c(X)$ (sup norm),
because we have the inequality

$$|dm| \leq \|m\|,$$

or written out explicitly,

$$\left| \int_X f \, dm \right| \leq \|f\| \, \|m\|.$$

In fact:

Theorem 4.2 (Riesz Theorem, Part 3). The map $m \mapsto dm$ is a norm-preserving
isomorphism between the space of regular complex Borel measures
on $X$ and the dual space of $C_c(X)$ (with sup norm topology).


Proof. Our map is obviously linear. To show that it is surjective, we
view any bounded functional $\lambda$ as a functional on $C_c(X, \mathbf{R})$ and then
decompose $\lambda$ into its real and imaginary parts, say $\lambda = \sigma + i\tau$, where $\sigma$, $\tau$
are then bounded functionals. We express each real bounded functional
as a difference of positive functionals using Theorem 1.2, and apply part
1 of the Riesz theorem to these positive functionals to represent them by
positive measures. If $\pi$ is a positive bounded functional and $\mu$ is the
measure which represents $\pi$ by Theorem 2.3, then $\mu(X) < \infty$. To see this,
note that by condition (iv) of this theorem, we have

$$\mu(X) = \sup \mu(K) \quad \text{for } K \text{ compact}.$$

If $K \prec f$, we must have

$$\mu(K) \leq \int_X f \, d\mu = \pi f \leq C\|f\| = C$$

where $C = |\pi|$, so that in fact $\mu(X) \leq |\pi|$. By definition and the other
conditions of Theorem 2.3, we conclude that $\mu$ is regular. If $\mu_i$ with
$i = 1, \ldots, 4$ are the bounded regular positive measures representing $\sigma^+$, $\sigma^-$,
$\tau^+$, $\tau^-$ respectively, then the complex measure

$$m = \mu_1 - \mu_2 + i(\mu_3 - \mu_4)$$

is regular and represents $\lambda$, i.e. we have $\lambda = dm$, thus proving that our
map is surjective.
To show that the map is injective, we have to prove that its kernel is
0. Suppose that $dm = 0$. Let $\mu = |m|$ and $dm = h \, d\mu$ with $|h| = 1$. Then
$\langle f, h \rangle_\mu = 0$ for all $f \in C_c(X)$. But $C_c(X)$ is $L^1$-dense in $\mathcal{L}^1(\mu, \mathbf{C})$ by
Theorem 3.1. We have the inequality

$$|\langle f, h \rangle_\mu| \leq \|h\|_\infty \|f\|_1$$

for all $f \in \mathcal{L}^1(\mu, \mathbf{C})$. It follows that $\langle \varphi, h \rangle_\mu = 0$ for all step functions $\varphi$,
whence $h$ is equal to 0 almost everywhere. Since $|h| = 1$ we must have
$\mu(X) = 0$, thus proving $m = 0$.

Finally, write again $dm = h \, d\mu$ with $\mu = |m|$ and $|h| = 1$. Let $\lambda = dm$.
We have to show that $\mu(X) \leq |\lambda|$. By Lusin's theorem, §3, we can find a
function $g \in C_c(X)$ such that $g = \bar{h}$ except on a set $Z$ of measure $< \varepsilon$, and
such that $|g| \leq 1$ on $Z$. Consequently

$$|\lambda| \geq |\lambda g| = \left| \int_X g h \, d\mu \right| \geq \mu(X - Z) - \mu(Z) \geq \mu(X) - 2\varepsilon.$$

This proves the desired inequality, and concludes the proof of the
theorem.
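
The norm equality is easy to see in the simplest case of a measure carried by finitely many points, $m = \sum_j c_j \delta_{x_j}$, where $\|m\| = \sum_j |c_j|$ and the density of $m$ with respect to $|m|$ is $h(x_j) = c_j/|c_j|$. The sketch below is an illustration under that assumption; the masses `weights` and the helper `pair` are hypothetical. Choosing $f$ with $f(x_j) = \overline{h(x_j)}$ gives $|f| \leq 1$ and $\langle f, dm \rangle = \|m\|$. In the general case such an $f$ need not be continuous, which is where Lusin's theorem enters.

```python
def pair(f_values, weights):
    """<f, dm> for the discrete measure m = sum_j weights[j] * delta_{x_j}."""
    return sum(c * v for c, v in zip(weights, f_values))


weights = [3 + 4j, -2 + 0j, 1j]                    # assumed complex masses c_j
total_variation = sum(abs(c) for c in weights)     # ||m|| = |m|(X)

# f(x_j) is the conjugate of h(x_j) = c_j / |c_j|, so |f| = 1 at each point.
f_values = [c.conjugate() / abs(c) for c in weights]
attained = pair(f_values, weights)
print(total_variation, attained)
```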


Remark. Let $\lambda$ be a $C_c$-functional on $C_c(X)$. For each compact subset
$K$, the restriction of the functional to $C_K(X)$ is a bounded functional,
which has a corresponding measure $\mu_K$ by Theorem 4.2. If $K_1 \subset K_2$ are
two compact sets, then it is easily verified that the restriction of $\mu_{K_2}$ to
$K_1$ is $\mu_{K_1}$. If $A$ is Borel-measurable and $A \subset K$, then we define

$$\mu_\lambda(A) = \mu_K(A),$$

which does not depend on the choice of $K$. This function $\mu_\lambda$, defined on
Borel-measurable subsets $A$ of compact sets, will also be called a measure,
and more specifically the measure associated with $\lambda$. For instance,
suppose $\mu$ is a positive Borel measure on $X$, and $f$ is a measurable
function, bounded on each compact set, then $\lambda = f \, d\mu$ defines such a
functional, which has such an associated measure. The measure $\mu_\lambda$ could
also be called the direct limit of the measures $\mu_K$, taken over all compact
sets $K$.


IX, §5. LOCALIZATION OF A MEASURE 
AND OF THE INTEGRAL 


The introduction of partitions of unity in §2 is not as accidental as it 
seems. We can use them to localize a measure, or functional. 
Theorem 5.1. Let $\{W_\alpha\}$ be an open covering of $X$. For each index $\alpha$,
let $\lambda_\alpha$ be a functional on $C_c(W_\alpha)$. Assume that for each pair of indices $\alpha$,
$\beta$ the functionals $\lambda_\alpha$ and $\lambda_\beta$ are equal on $C_c(W_\alpha \cap W_\beta)$. Then there exists
a unique functional $\lambda$ on $X$ whose restriction to each $C_c(W_\alpha)$ is equal to
$\lambda_\alpha$. If each $\lambda_\alpha$ is positive, then so is $\lambda$.

Proof. Let $f \in C_c(X)$ and let $K$ be the support of $f$. Let $\{h_i\}$ be a
partition of unity over $K$ subordinated to a covering of $K$ by a finite
number of the open sets $W_\alpha$. Then each $h_i f$ has support in some $W_{\alpha(i)}$
and we define

$$\lambda f = \sum_i \lambda_{\alpha(i)}(h_i f).$$

We contend that this sum is independent of the choice of $\alpha(i)$, and also
of the choice of partition of unity. Once this is proved, it is then obvious
(see Exercise 10) that $\lambda$ is a functional which satisfies our requirements.
We now prove this independence. First note that if $W_{\alpha'(i)}$ is another one
of the open sets $W_\alpha$ in which the support of $h_i f$ is contained, then $h_i f$ has
support in the intersection $W_{\alpha(i)} \cap W_{\alpha'(i)}$, and our assumption concerning
our functionals $\lambda_\alpha$ shows that the corresponding term in the sum does
not depend on the choice of index $\alpha(i)$. Next, let $\{g_k\}$ be another partition
of unity over $K$ subordinated to some covering of $K$ by a finite
number of the open sets $W_\alpha$. Then for each $i$,

$$h_i f = \sum_k g_k h_i f,$$

whence

$$\sum_i \lambda_{\alpha(i)}(h_i f) = \sum_i \sum_k \lambda_{\alpha(i)}(g_k h_i f).$$

If the support of $g_k h_i f$ is in some $W_\alpha$, then the value $\lambda_\alpha(g_k h_i f)$ is independent
of the choice of index $\alpha$. The expression on the right is then
symmetric with respect to our two partitions of unity, whence our theorem
follows.


Corollary 5.2. Let $\{W_\alpha\}$ be an open covering of $X$. For each index $\alpha$,
let $\mu_\alpha$ be a positive $\sigma$-regular measure on $W_\alpha$. Assume that for each
pair of indices $\alpha$, $\beta$ the measures $\mu_\alpha$ and $\mu_\beta$ induce equal measures on
$W_\alpha \cap W_\beta$. Then there exists a unique $\sigma$-regular positive measure $\mu$ on
$X$ whose restriction to each $W_\alpha$ is equal to $\mu_\alpha$.


Proof. This is merely a rewording of the theorem, in view of the 
correspondence between o-regular measures and positive functionals. 


Theorem 5.1 will be used only in the proof of Stokes’ theorem in 
Chapter XXIII. 
In §2 and in Theorem 5.1, we dealt with partitions of unity over a
compact subset of $X$. We shall now discuss partitions of unity over all
of $X$.

Let $\mathcal{U}$ be a covering of $X$, say by open sets. We say that $\mathcal{U}$ is locally
finite if every point of $X$ has a neighborhood which intersects only finitely
many elements of the covering. A refinement $\{V_i\}$ of a covering $\{U_j\}$ of
$X$ is a covering such that each $V_i$ is contained in some $U_j$. We also say
that the covering $\{V_i\}$ is subordinated to the covering $\{U_j\}$.

A (continuous) partition of unity on $X$ consists of an open covering
$\{V_i\}$ of $X$ and a family of real continuous functions

$$\psi_i \colon X \to \mathbf{R}$$

satisfying the following conditions.

PU 1. For all $x \in X$, we have $\psi_i(x) \geq 0$.
PU 2. The support of $\psi_i$ is contained in $V_i$.
PU 3. The covering $\{V_i\}$ is locally finite.
PU 4. For each point $x \in X$, we have

$$\sum \psi_i(x) = 1.$$

(The sum is taken over all $i$, but is in fact finite for any given $x$ in view
of PU 3.) As a matter of notation, we often write that $\{(V_i, \psi_i)\}$ or simply
$\{\psi_i\}$ is a partition of unity if it satisfies the previous four conditions.
In the proof of the next theorem, we use the facts (trivially proved)
that if a space $X$ has a countable base, then any open covering has a
countable subcovering, and any base contains a countable base.


Theorem 5.3. Let $X$ be locally compact Hausdorff, and assume that the
topology of $X$ has a countable base. Then $X$ admits continuous partitions
of unity, subordinated to a given open covering $\mathcal{U}$.

Proof. Let $U_1, U_2, \ldots$ be a base for the open sets, such that each
$\overline{U}_i$ is compact. We construct first inductively a sequence $A_1, A_2, \ldots$ of
compact sets whose union is $X$ and such that $A_i$ is contained in the
interior of $A_{i+1}$. We let $A_1 = \overline{U}_1$. If we have constructed $A_i$ inductively,
then we let $j$ be the smallest integer such that $A_i$ is contained in

$$U_1 \cup \cdots \cup U_j$$

and we let $A_{i+1}$ be the compact set

$$\overline{U}_1 \cup \cdots \cup \overline{U}_j.$$

Let Int abbreviate interior. For each point $x$ of $A_{i+1} - \operatorname{Int}(A_i)$ we can
find a pair $(W_x, V_x)$ of open sets containing $x$ such that $W_x \subset \overline{W}_x \subset V_x$,
such that $V_x$ is contained in $\operatorname{Int}(A_{i+2}) - A_{i-1}$, and such that $V_x$ is contained
in one of the open sets of the given covering $\mathcal{U}$. There is a finite
number of pairs such that already the open sets $W_x$ cover the compact
set $A_{i+1} - \operatorname{Int}(A_i)$. Taking all such finite collections of pairs for $i = 1, 2,
\ldots$, we obtain a countable collection of pairs $\{(W_k, V_k)\}$ such that the
$\{V_k\}$ form a locally finite covering of $X$, the $\{W_k\}$ is also an open covering,
and $\overline{W}_k \subset V_k$. Let $h_k$ be such that $\overline{W}_k \prec h_k \prec V_k$ (see the beginning of
§2 for the notation $\prec$). Let

$$h = \sum_k h_k.$$

Let

$$\psi_k = h_k / h.$$

Then $\{\psi_k\}$ is the desired partition of unity.
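
The final normalization step can be illustrated on $X = \mathbf{R}$. In the sketch below, the covering $V_k = (k - 1, k + 1)$ for integers $k$, the sets $W_k = (k - 0.6, k + 0.6)$, and the trapezoid functions $h_k$ are assumptions chosen for the example, not the sets produced by the inductive construction in the proof. Each $h_k$ is 1 on $\overline{W}_k$ and has support in $V_k$. At every point only finitely many $h_k$ are nonzero, so $h = \sum h_k$ is a finite sum there, and $\psi_k = h_k/h$ sums to 1.

```python
import math


def h_k(k, x):
    """Trapezoid equal to 1 on [k - 0.6, k + 0.6] with support [k - 0.9, k + 0.9]."""
    d = abs(x - k)
    if d <= 0.6:
        return 1.0
    return max(0.0, (0.9 - d) / 0.3)


def nearby(x):
    """Indices k for which V_k = (k - 1, k + 1) can contain x (finite by local finiteness)."""
    n = math.floor(x)
    return range(n - 1, n + 3)


def psi(k, x):
    """psi_k = h_k / h, where h is the sum of all h_j, a finite sum at each point."""
    h = sum(h_k(j, x) for j in nearby(x))
    return h_k(k, x) / h


xs = [i / 37 - 5 for i in range(371)]              # sample points of [-5, 5]
totals = [sum(psi(k, x) for k in nearby(x)) for x in xs]
print(min(totals), max(totals))
```

The division is legitimate because the sets $W_k$ cover the space, so at each point some $h_k$ equals 1 and $h \geq 1$.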


Theorem 5.4. Let $\{h_i\}$ ($i = 1, 2, \ldots$) be a countable partition of unity on
$X$. Let $\mu$ be a regular positive Borel measure on $X$, and let

$$f \in \mathcal{L}^1(\mu).$$

Then for each $i$, $h_i f$ is in $\mathcal{L}^1(\mu)$, and

$$\sum_{i=1}^{\infty} \int_X h_i f \, d\mu = \int_X f \, d\mu$$

in the sense that the sum is absolutely convergent, and is equal to the
integral on the right.

Proof. Let

$$f_n = \sum_{i=1}^{n} h_i f.$$

Then $|f_n| \leq |f|$, and the sequence $\{f_n\}$ is pointwise convergent to $f$. We
can therefore apply the dominated convergence theorem to conclude the
proof.
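
A numerical sanity check of this statement, under assumed data: on $\mathbf{R}$ with Lebesgue measure, the tent functions $\psi_k(x) = \max(0, 1 - |x - k|)$ for integers $k$ form a countable partition of unity, and $f(x) = e^{-|x|}$ is integrable with integral 2. The partition, the function $f$, the truncation to $|k| \leq 40$, and the midpoint rule are all choices made for this sketch.

```python
import math


def psi(k, x):
    """Tent function with support [k - 1, k + 1]; the psi_k sum to 1 on R."""
    return max(0.0, 1.0 - abs(x - k))


def f(x):
    return math.exp(-abs(x))      # integrable on R, with integral exactly 2


def integral_of_piece(k, n=2000):
    """Midpoint-rule estimate of the integral of psi_k * f over [k - 1, k + 1]."""
    step = 2.0 / n
    total = 0.0
    for i in range(n):
        x = k - 1 + (i + 0.5) * step
        total += psi(k, x) * f(x)
    return total * step


pieces = [integral_of_piece(k) for k in range(-40, 41)]
total = sum(pieces)
print(total)
```

All the terms are positive here, so the absolute convergence asserted in the theorem is simply convergence of the partial sums. The omitted tail beyond $|k| = 40$ is of the order of $e^{-40}$.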


IX, §6. PRODUCT MEASURES ON LOCALLY 
COMPACT SPACES 


Let $X$, $Y$ be locally compact Hausdorff spaces, and let $\mu$, $\nu$ be positive
$\sigma$-regular Borel measures on $X$ and $Y$, respectively. We let $\mathcal{B}(X)$ and
$\mathcal{B}(Y)$ denote the $\sigma$-algebras of Borel sets in $X$ and $Y$, respectively. If $X$,
$Y$ are $\sigma$-finite with respect to these measures, then Fubini's theorem
applies. However, we warn the reader that in general, one does not have

$$\mathcal{B}(X \times Y) = \mathcal{B}(X) \otimes \mathcal{B}(Y).$$

(Even if $X$ is compact, $Y = X$. Examples are obtained by taking $X$ with
abnormally many open sets.) However, we can still integrate functions in
$C_c(X \times Y)$, as shown by the following results, which are nothing but
corollaries of the Stone-Weierstrass theorem, expressed as lemmas.


Lemma 6.1. Let $X$, $Y$ be locally compact Hausdorff spaces. Every
function in $C_c(X \times Y)$ can be uniformly approximated by functions
which are finite sums of type

$$(x, y) \mapsto \sum_i \varphi_i(x)\psi_i(y),$$

with $\varphi_i \in C_c(X)$ and $\psi_i \in C_c(Y)$.

Proof. We may restrict ourselves to the real case. We note that
functions of the above type form an algebra $A$ which separates points,
and this algebra is such that if $K$ is compact in $X \times Y$, then there exists
some $g \in A$ such that $g$ is equal to 1 on $K$. (For instance, if $C$, $D$ are the
projections of $K$ on $X$ and $Y$, respectively, then $K \subset C \times D$, and we can
write

$$g(x, y) = \varphi(x)\psi(y)$$

where $\varphi$ is 1 on $C$ and $\psi$ is 1 on $D$.) We are therefore reduced to
proving a second lemma.


Lemma 6.2. Let $X$ be locally compact Hausdorff, and let $A$ be an
algebra of real valued functions in $C_c(X)$, which separates points, and is
such that if $K$ is compact in $X$, then there exists $\alpha \in A$ which is 1 on $K$.
Then $A$ is dense in $C_c(X)$ for the sup norm.

Proof. Let $f \in C_c(X)$ and let $K$ be the compact support of $f$. Let
$\alpha \in A$ be 1 on $K$. Let $U$ be an open set containing the support of $\alpha$,
and having compact closure $\overline{U}$. The restrictions to $\overline{U}$ of elements of $A$
form an algebra, which clearly satisfies the hypotheses of the Stone-Weierstrass
theorem. Therefore the restriction $f|\overline{U}$ can be uniformly
approximated by elements of $A|\overline{U}$. Denote by $\|\ \|_{\overline{U}}$ the sup norm over
$\overline{U}$. If we can approximate $f$ by an element $\beta \in A$ over $\overline{U}$, say

$$\|f - \beta\|_{\overline{U}} < \varepsilon,$$

then

$$\|\alpha f - \alpha\beta\|_{\overline{U}} < \varepsilon\|\alpha\|,$$

and thus we have a uniform approximation of $\alpha f$ by $\alpha\beta$ over $\overline{U}$. But
$\alpha f = f$, and $\alpha\beta$ is equal to 0 outside $U$. Thus the uniform approximation
holds over all of $X$, as was to be shown.


As a matter of notation, if $\varphi$ is a function on $X$ and $\psi$ is a function
on $Y$, then we denote by $\varphi \otimes \psi$ the function

$$(x, y) \mapsto \varphi(x)\psi(y),$$


and call it the product function. The set of finite sums of product 
functions is an algebra, which we shall call the algebra generated by the 
product functions. 


Theorem 6.3. Let $X$, $Y$ be locally compact Hausdorff spaces and let $\mu$,
$\nu$ be positive $\sigma$-regular Borel measures on $X$ and $Y$, respectively. Assume
that $X$, $Y$ are $\sigma$-finite with respect to these measures. Then all functions
in $C_c(X \times Y)$ are in $\mathcal{L}^1(\mu \otimes \nu)$, and there exists a unique $\sigma$-regular
Borel measure on $X \times Y$ which restricts to $\mu \otimes \nu$ on $\mathcal{B}(X) \otimes \mathcal{B}(Y)$.

Proof. Lemma 6.1 shows that functions in $C_c(X \times Y)$ are $(\mu \otimes \nu)$-measurable,
and combined with Fubini's theorem shows that these functions
are in $\mathcal{L}^1(\mu \otimes \nu)$. The map

$$f \mapsto \int_{X \times Y} f \, d(\mu \otimes \nu)$$

then is obviously a positive functional on $C_c(X \times Y)$, and we can therefore
apply Theorem 2.3 to get a $\sigma$-regular Borel measure having the
desired properties. Corollary 2.8 gives the uniqueness, thus proving
our theorem.


IX, §7. EXERCISES 


We assume throughout that X is locally compact Hausdorff. 


1. Let $X$ be compact, and let $C(X)$ be the algebra of real continuous functions
on $X$. If $\lambda$ is a functional on $C(X)$, such that $\lambda(1) = |\lambda|$, show that $\lambda$ is
positive.

2. Assume that $X$ is separable. Show that every open set is $\sigma$-compact.

3. Show that the complex regular Borel measures form a Banach space.

4. Assume that $X$, $Y$ are locally compact Hausdorff and $\sigma$-compact. If $\mu$, $\nu$ are
regular Borel measures on $X$, $Y$ respectively, show that $\mu \otimes \nu$ is regular.
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5. 


10. 


11. 


. (a 


Let pw, v be regular Borel measures on R". Define the convolution y*v by 


(1 * v)(A) = (u@ v)(o (A), 


where a: R" x R" > R" is the sum. Show that pv is regular. 


6. Assume that X is $\sigma$-compact. Let $\mu$ be a regular Borel measure on X. If A is
measurable, show that there exists a closed set $B \subset A$ and an open set $V \supset A$
such that $\mu(V - B) < \varepsilon$.


7. Assume that every open set in X is $\sigma$-compact. If $\nu$ is a positive Borel
measure which is finite on compact sets, show that $\nu$ is regular. [Hint: Show
that $\nu = \mu$ if $\mu$ is the regular measure associated with $d\nu$ as in the text. Do it
first for open sets.]


8. (a) Let M denote the Banach space of complex regular Borel measures on $\mathbf{R}^n$.
If m, m' are in M, show that for $f \in C_c(\mathbf{R}^n)$ the integral


$$\iint f(x + y) \, dm(x) \, dm'(y)$$


exists, and defines a bounded functional on $C_c(\mathbf{R}^n)$, whose measure is
denoted by $m * m'$, and is called the convolution of m and m'. Prove that
convolution of elements of M is associative, bilinear, commutative, and
has a unit element. Thus M is a Banach algebra.

(b) Let $\mu$ be Lebesgue measure and let $f \in \mathcal{L}^1(\mu, \mathbf{C})$. Show that for any
$m \in M$ we have


$$m * \mu_f = \mu_g$$

for some $g \in \mathcal{L}^1(\mu, \mathbf{C})$. In algebraic terminology, this means that the
absolutely continuous elements of M (with respect to Lebesgue measure)
form an ideal in M.


9. (a) Let $\lambda$ be a bounded functional on $C_c(X)$, and let m be the regular complex
Borel measure such that $dm = \lambda$. Show that $\lambda$ extends uniquely to a
functional on $\mathcal{L}^1(|m|)$ by continuity, and that this follows at once from
the remarks preceding Theorem 4.2.

(b) Let $\{h_j\}$ (j = 1, 2, ...) be a countable partition of unity on X. Let
$f \in \mathcal{L}^1(|m|)$. Show that


$$\sum_j \lambda(h_j f) = \lambda(f).$$


[Note: This obvious extension of the text, and of Theorem 5.4 in particular,
is useful when dealing with manifolds. Cf. for instance Chapter XXIII, §3, §5,
and §6.]


10. Verify in detail the "obvious" fact in the proof of Theorem 5.1 that $\lambda$ is a
functional, in particular that for each compact set K there is a number $A_K$
such that for any $f \in C_c(X)$ with support in K we have $|\lambda f| \leq A_K \|f\|$.


11. Let $\mu$ be a regular positive measure on $\mathbf{R}$. (a) Show that the functions of type
$e^{-x} g(x)$ (where g is a polynomial) are dense in $\mathcal{L}^1(\mathbf{R}^+, \mu)$. (b) Show that the
functions of type $e^{-x^2} g(x)$ (where g is a polynomial) are dense in $\mathcal{L}^1(\mathbf{R}, \mu)$.
(c) Same thing for $\mathcal{L}^p$ with $1 < p < \infty$. [Hint: Cf. Exercises 19 and 20 of
Chapter III.]


12. Let $\mu$ be a regular positive Borel measure on $\mathbf{R}$. If $f \in \mathcal{L}^1(\mu)$ and


$$\int_{\mathbf{R}} f(x) e^{itx} \, d\mu(x) = 0$$


for all real t, show that f(x) = 0 for $\mu$-almost all x. [Hint: By a Fourier
series argument, show that


$$\int f g \, d\mu = 0$$


if g is $C^\infty$ of period 2N with large N, and then, also if $g \in C_c(\mathbf{R})$.]


13. Let $\mu$ be a regular positive Borel measure on $\mathbf{R}$.
(a) Assume that there exists c > 0 such that the function $e^{c|x|}$ is in $\mathcal{L}^1(\mu)$.
Let $f \in \mathcal{L}^p(\mu)$, $1 < p < \infty$. If f is orthogonal to all functions $\{x^n\}$ ($n \geq 0$),
i.e.


$$\int_{\mathbf{R}} f(x) x^n \, d\mu(x) = 0$$


for all $n \geq 0$, then f(x) = 0 for $\mu$-almost all x.
(b) Let $f \in \mathcal{L}^1(\mu)$. If there exists c > 0 such that


$$\int_{\mathbf{R}} f(x) x^n e^{-c|x|} \, d\mu(x) = 0$$


for all $n \geq 0$, then f(x) = 0 for $\mu$-almost all x. Note: Actually, (b) implies (a).
[Hint for (a): Show that the integral in Exercise 12 is analytic in t for t at a
distance < c/q from the real line, and 0 near the origin. You can also use
the exercises at the end of Chapter III.]

Examples. Take $d\mu(x) = e^{-x^2} \, dx$. We get the completeness of the Hermite
polynomials. For the Laguerre polynomials, one takes $d\mu(x) = h(x) \, dx$,
where h(x) = 0 if x < 0 and $h(x) = e^{-x}$ if $x \geq 0$. And similarly for the other
classical polynomials, which are obtained by applying the orthogonalization
process to $\{x^n\}$.


14. Let X be compact and let $\mu$ be a regular measure on X. Let A be a
subset of X whose boundary has measure 0. Let $\{x_n\}$ be a sequence of points
of X having the following property. For every continuous function f on X
we have


$$(*) \qquad \lim_{n \to \infty} \frac{1}{n} \sum_{i=1}^{n} f(x_i) = \int_X f \, d\mu.$$


Let N(A, n) be the number of indices $i \leq n$ such that $x_i \in A$. Prove that


$$\lim_{n \to \infty} \frac{N(A, n)}{n} = \mu(A).$$


This is the equidistribution of the sequence $\{x_n\}$ in X. In some applications,
instead of using condition (*) for all continuous functions f one uses it only
for a vector space of more special functions, dense in C(X) (e.g. the space
generated by characters on a compact group).


15. (Karamata's theorem.) Let $\mu$ be a regular positive Borel measure on $\mathbf{R}^+$ such
that the integral

$$\int_0^\infty e^{-tx} \, d\mu(x)$$

converges for t > 0, and such that

$$\lim_{t \to 0} t^r \int_0^\infty e^{-tx} \, d\mu(x) = C$$


for some positive constants r and C. If f is a continuous function on [0, 1],
then


$$\lim_{t \to 0} t^r \int_0^\infty f(e^{-tx}) e^{-tx} \, d\mu(x) = \frac{C}{\Gamma(r)} \int_0^\infty f(e^{-t}) t^{r-1} e^{-t} \, dt.$$


[Hint: By Weierstrass' approximation, it suffices to prove the theorem
when f is a polynomial, and hence when $f(x) = x^k$. This is done by direct
computation.]

For an application of Karamata’s theorem, see [BGV, p. 95]. 


16. Let $\mu$ be a finite positive regular Borel measure on $\mathbf{R}$. Assume that


| e* du(x)=0 forall neZ. 
R 


Prove that $\mu = 0$.


CHAPTER X 


Riemann-Stieltjes Integral
and Measure 


This chapter gives an example of a measure which arises from a func- 
tional, defined essentially by generalizations of Riemann sums. We get 
here into special aspects of the real line, as distinguished from the general 
theory of integration on general spaces. 


X, §1. FUNCTIONS OF BOUNDED VARIATION 
AND THE STIELTJES INTEGRAL 


Let us start with a finite interval [a, b] on the real line. To each
partition


$$P = [a = x_0, x_1, \ldots, x_n = b]$$

we associate its size,

$$\sigma(P) = \operatorname{size} P = \max_k \, (x_{k+1} - x_k).$$


Let

$$f\colon [a, b] \to E$$


be a mapping of the interval into a Banach space. Let P be a partition
of [a, b] as above. We define the variation $V_P(f)$ to be


$$V_P(f) = \sum_{k=0}^{n-1} |f(x_{k+1}) - f(x_k)|.$$

We define the variation


$$V(f) = \sup_P V_P(f),$$


where the sup (least upper bound if it exists, otherwise $\infty$) is taken over
all partitions. If V(f) is finite, then f is said to be of bounded variation;
in that case f is bounded.
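
As a small numerical illustration of this definition (a sketch only; the helper names `variation` and `uniform_partition` are ours and do not come from the text), one can evaluate $V_P(f)$ for f = sin on $[0, 2\pi]$ over a coarse partition and over a refinement of it. The refinement gives a larger sum, and since it contains the points $\pi/2$ and $3\pi/2$ where sin changes direction, its sum agrees with the total variation 4 up to rounding.

```python
import math

def variation(f, points):
    """Return V_P(f), the sum of |f(x_{k+1}) - f(x_k)| over the partition points."""
    return sum(abs(f(b) - f(a)) for a, b in zip(points, points[1:]))

def uniform_partition(a, b, n):
    """Partition of [a, b] into n subintervals of equal length."""
    return [a + (b - a) * k / n for k in range(n + 1)]

# Coarse partition with 3 subintervals, and a refinement with 300 subintervals
# (300 is a multiple of 3, and the refinement contains pi/2 and 3*pi/2).
coarse = uniform_partition(0.0, 2 * math.pi, 3)
fine = uniform_partition(0.0, 2 * math.pi, 300)

v_coarse = variation(math.sin, coarse)
v_fine = variation(math.sin, fine)
print(v_coarse, v_fine)
```

The choice of sin and of uniform partitions is only for convenience; any function and any increasing list of points can be passed to `variation`.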


Examples. If f is real valued, increasing, and bounded on [a, b], then 
f is obviously of bounded variation, in fact bounded by f(b) — f(a). 

If f is differentiable on [a, b] and f' is bounded, then f is of bounded
variation (mean value theorem). This is so in particular if f is of class
$C^1$.


The mappings of bounded variation form a vector space. In fact, if f
and g are of bounded variation, then for $\alpha \in \mathbf{C}$,


$$V(f + g) \leq V(f) + V(g) \quad \text{and} \quad V(\alpha f) = |\alpha| V(f).$$


In other words, denoting by BV([a, b]) the space of mappings of bounded
variation, we see that the variation V is a seminorm on BV([a, b]).

Let f, g have values in Banach spaces E, F respectively, and suppose
given a product (bilinear map) $E \times F \to G$ into a Banach space, denoted
$(u, v) \mapsto uv$, and satisfying $|uv| \leq |u||v|$. If f, g are of bounded variation,
so is fg, as one verifies easily. A special case of a product occurs of
course when $E = \mathbf{C}$ itself, and multiplication is just multiplication by
scalars. One then obtains a bound for the variation of a product, namely


$$V(fg) \leq \|f\| V(g) + \|g\| V(f).$$


This estimate is an immediate consequence of estimating the sums for the 
variation and using the triangle inequality. 

The notation for the variation really should include the interval, and
we should write


$$V(f, a, b).$$


Define


$$V_f(x) = V(f, a, x),$$


so $V_f$ is now a function of x, called the variation function of f.


Proposition 1.1. Let $f \in BV([a, b])$.


(i) The function $V_f$ is increasing.
(ii) If $a \leq x \leq y \leq b$, then


$$V(f, a, y) = V(f, a, x) + V(f, x, y).$$


(iii) If f is continuous, then $V_f$ is continuous.
Proof. For (i), we note that if x < y, then we can always refine a
partition of [a, y] to include the number x. Furthermore, if P' is a
partition refining P, then


$$V_P(f) \leq V_{P'}(f).$$


Then (i) follows at once. For (ii), we again use the fact that a partition of
[a, y] can be refined to contain x. Finally, suppose that f is continuous.
By (ii), the continuity from the right of $V_f$ amounts to proving that


$$\lim_{y \to x} V(f, x, y) = 0.$$


Suppose that the limit is not 0. Then there exists a number $\delta > 0$ such
that


$$V(f, x, t) > \delta \quad \text{for all } x < t \leq y.$$


Let $x = x_0 < x_1 < \cdots < x_n = y$ be a partition of [x, y] such that


$$\sum_{k=0}^{n-1} |f(x_{k+1}) - f(x_k)| > \delta.$$


By the continuity of f at x, we can select $y_1$ very close to x and in
particular $x < y_1 < x_1$ such that the inequality remains valid if we replace
the term


$$|f(x_1) - f(x)| \quad \text{by} \quad |f(x_1) - f(y_1)|.$$


Thus we have proved:


There exists $y_1$ with $x < y_1 < y$ such that


$$V(f, y_1, y) > \delta.$$


Now we repeat this procedure with y replaced by $y_1$, and find $y_2$ with
$x < y_2 < y_1$ such that


$$V(f, y_2, y_1) > \delta.$$

After N steps, we get


$$V(f, y_N, y) > N\delta.$$


Since $V(f, y_N, y) \leq V(f, x, y)$, this gives a contradiction, concluding the
proof.


Theorem 1.2. Let f be a real valued function on [a, b] of bounded
variation. Then there exist increasing functions g, h on [a, b] such that
g(a) = h(a) = 0, and

$$f(x) - f(a) = g(x) - h(x),$$
$$V_f(x) = g(x) + h(x).$$


If f is continuous, so are g and h.


Proof. Define g, h by the formulas

$$2g = V_f + f - f(a) \quad \text{and} \quad 2h = V_f - f + f(a).$$


If f is continuous, so are g and h by Proposition 1.1 (iii). In any case,
g(a) = h(a) = 0 and the two formulas of the theorem are valid. There
remains only to prove that g, h are increasing. Let $a \leq x \leq y \leq b$. Then
by the additivity of Proposition 1.1 (ii),


$$2g(y) - 2g(x) = V(f, x, y) + f(y) - f(x) \geq 0,$$


so g is increasing, and similarly h is increasing, thus concluding the proof.
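
The formulas defining g and h can be tried out on a grid. The following is only a sketch under an assumption of ours: the variation function $V_f$ is replaced by the variation over the grid points themselves, so it is a discrete analogue and not the exact decomposition. The function name `jordan_on_grid` is hypothetical.

```python
import math

def jordan_on_grid(f, points):
    """Discrete analogue of 2g = V_f + f - f(a), 2h = V_f - f + f(a).

    V_f is approximated by the running variation over the grid points,
    so the returned lists are exact only for that partition.
    """
    fa = f(points[0])
    v = 0.0
    g, h = [0.0], [0.0]
    for x0, x1 in zip(points, points[1:]):
        v += abs(f(x1) - f(x0))            # running variation on the grid
        g.append(0.5 * (v + f(x1) - fa))   # increases by (|df| + df)/2 >= 0
        h.append(0.5 * (v - f(x1) + fa))   # increases by (|df| - df)/2 >= 0
    return g, h

points = [2 * math.pi * k / 200 for k in range(201)]
g, h = jordan_on_grid(math.sin, points)
```

Both lists are increasing and their difference reproduces f(x) - f(a) at the grid points, which mirrors the two formulas of the theorem.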


We now generalize the notion of Riemann integral to that of Stieltjes.
Let f, g be bounded maps of [a, b] into Banach spaces E, F respectively,
and assume given a product $E \times F \to G$ denoted by $(u, v) \mapsto uv$ such that
$|uv| \leq |u||v|$. Given a partition $P = \{a = x_0, x_1, \ldots, x_n = b\}$ and numbers
$c_k$ with $x_k \leq c_k \leq x_{k+1}$, we define the Riemann-Stieltjes sum (abbreviated
RS-sum) as before, namely


$$S(P, c, f, g) = S(P, c) = \sum_{k=0}^{n-1} f(c_k)[g(x_{k+1}) - g(x_k)].$$


Denote by $\sigma(P)$ the size of the partition P. We say that the limit


$$\lim_{\sigma(P) \to 0} S(P, c)$$


exists if there exists $L \in G$ such that given $\varepsilon$ there exists $\delta$ such that
whenever $\sigma(P) < \delta$ then $|S(P, c) - L| < \varepsilon$. If


$$\lim_{\sigma(P) \to 0} S(P, c, f, g)$$


exists, we say that f is RS(g)-integrable, and we denote the limit by the
integral called the Riemann-Stieltjes integral


$$\int_a^b f \, dg.$$


When g(x) = x, then the integral is just the Riemann integral, and we call
the function Riemann integrable.
It is immediate that the set of RS(g)-integrable maps is a vector space,
and that the RS-integral is a functional on this space. The space itself
will also be denoted by RS(g), or RS(g, [a, b]) if we want to specify
[a, b].

Let a < b < c. It is immediately verified that if f is in RS(g, [a, c]),
then f is in RS(g, [a, b]) and RS(g, [b, c]) and we have the usual
property

$$\int_a^c f \, dg = \int_a^b f \, dg + \int_b^c f \, dg.$$


(Warning: The converse may not hold!)

Observe also that the RS-integral is linear in g. In other words if
$f \in RS(g_1)$ and $f \in RS(g_2)$, then $f \in RS(g_1 + g_2)$; if $\alpha \in \mathbf{C}$ then $f \in RS(\alpha g)$;
and we have linearity of the integral keeping f fixed, viewing the integral
as a function of g, that is


$$\int_a^b f \, d(g_1 + g_2) = \int_a^b f \, dg_1 + \int_a^b f \, dg_2, \qquad \int_a^b f \, d(\alpha g) = \alpha \int_a^b f \, dg.$$


These are immediate, using the triangle inequality.
We have the usual estimate with the sup norm.
We have the usual estimate with the sup norm. 


Proposition 1.3. Assume $f \in RS(g)$, and g of bounded variation. Then


$$\left| \int_a^b f \, dg \right| \leq \|f\| V(g),$$


where $\|f\|$ is the sup norm of f on [a, b].


Proof. Immediate by estimating an approximating RS-sum and using 
the triangle inequality. 


Proposition 1.4. We have $f \in RS(g)$ if and only if $g \in RS(f)$, and in that
case, the formula for integration by parts holds, namely


$$\int_a^b f \, dg + \int_a^b g \, df = f(b)g(b) - f(a)g(a).$$


Proof. Suppose that $\int_a^b g \, df$ exists. Consider an RS-sum which we unwind
using summation by parts:


$$S(P, c, f, g) = \sum_{k=0}^{n-1} f(c_k)[g(x_{k+1}) - g(x_k)]$$

$$= -\sum_{k=1}^{n-1} [f(c_k) - f(c_{k-1})] g(x_k) + f(c_{n-1})g(b) - f(c_0)g(a)$$

$$= f(b)g(b) - f(a)g(a) - S(Q, g, f)$$

where


$$S(Q, g, f) = g(a)[f(c_0) - f(a)] + \sum_{k=1}^{n-1} g(x_k)[f(c_k) - f(c_{k-1})] + g(b)[f(b) - f(c_{n-1})].$$


Here we have written the product in reverse direction.
Then S(Q, g, f) is an RS-sum with respect to the partition


$$Q = [a, c_0, c_1, \ldots, c_{n-1}, b]$$


with the intermediate points $a, x_1, \ldots, x_{n-1}, b$. When the size of P
approaches 0, so does the size of Q, and by hypothesis, the sum S(Q, g, f)
approaches the integral $\int_a^b g \, df$, thereby completing the proof of the
proposition.
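
The summation by parts in the proof is an exact algebraic identity between two finite sums, so it can be checked directly. The sketch below uses scalar-valued functions and names of our own choosing (`rs_sum`, `q_tags`); the particular f, g and the midpoint tags are arbitrary assumptions made only for the illustration.

```python
import math

def rs_sum(f, g, points, tags):
    """Riemann-Stieltjes sum: sum of f(c_k) * (g(x_{k+1}) - g(x_k))."""
    return sum(f(c) * (g(x1) - g(x0))
               for c, x0, x1 in zip(tags, points, points[1:]))

f = lambda x: x * x
g = math.sin
a, b, n = 0.0, 1.0, 10

P = [a + (b - a) * k / n for k in range(n + 1)]      # x_0, ..., x_n
c = [0.5 * (x0 + x1) for x0, x1 in zip(P, P[1:])]    # c_0, ..., c_{n-1}

Q = [a] + c + [b]               # partition a, c_0, ..., c_{n-1}, b
q_tags = [a] + P[1:-1] + [b]    # intermediate points a, x_1, ..., x_{n-1}, b

lhs = rs_sum(f, g, P, c)
rhs = f(b) * g(b) - f(a) * g(a) - rs_sum(g, f, Q, q_tags)
print(lhs, rhs)
```

The two printed values agree up to floating-point rounding for any choice of partition and tags, since no limit is involved at this stage.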


Finally we give a criterion for RS-integrability. 


Proposition 1.5. Assume that f is continuous and g of bounded variation
on [a, b]. Then $f \in RS(g)$.


Proof. Given $\varepsilon$ let $\delta$ be such that if $|x - y| < \delta$ then $|f(x) - f(y)| < \varepsilon$.
Let P, P' be partitions of size $< \delta$. To estimate $|S(P, c) - S(P', c')|$, we
may assume without loss of generality that P' is a refinement of P. Thus
it suffices to prove two estimates: if P' = P but we change the choice of
intermediate points c to c', then the difference of the sums is small; and
if P' is obtained from P by inserting one more point in the partition,
then again the difference of the sums is small. As to the first step, letting
P = P', we have


$$|S(P, c) - S(P, c')| \leq \sum_k |f(c_k) - f(c'_k)| \, |g(x_{k+1}) - g(x_k)| \leq \varepsilon V(g),$$


which gives us the desired estimate. Secondly, suppose that P' is obtained
from P by inserting one point, say $x'_j$ with $x_j < x'_j \leq x_{j+1}$. Then
the size of P' is still $< \delta$. By the first step, to get the desired estimate for
$|S(P, c) - S(P', c')|$ we may assume without loss of generality that for
$i \neq j$ we have $c'_i = c_i$, that $c_j = x'_j$, and $x'_j$ is also selected as the intermediate
point for the two intervals $[x_j, x'_j]$ and $[x'_j, x_{j+1}]$ of the partition
P'. Then


$$S(P, c) - S(P', c') = 0,$$
and the proposition is proved.


Often the RS-integral can be computed as a Riemann integral, as in 
the next proposition. 
Proposition 1.6. Let f be continuous and suppose that g is real differentiable
on [a, b] with Riemann-integrable g'. Then $f \in RS(g)$, and


$$\int_a^b f \, dg = \int_a^b f(x) g'(x) \, dx,$$


where the integral on the left is the RS-integral, and the integral on the
right is the usual Riemann integral.


Proof. This is immediate by using the mean value theorem on g and 
estimating the RS-sum by the triangle inequality and the hypothesis that 
g' is Riemann integrable. 
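
A quick numerical check of the formula, as a sketch only: we take f(x) = x and $g(x) = x^2$ on [0, 1] (our choice, not an example from the text), so that the right-hand side is $\int_0^1 2x^2 \, dx = 2/3$, and compare it with an RS-sum over a fine uniform partition with midpoint tags. The helper `rs_sum` is a hypothetical name.

```python
def rs_sum(f, g, points, tags):
    """Riemann-Stieltjes sum: sum of f(c_k) * (g(x_{k+1}) - g(x_k))."""
    return sum(f(c) * (g(x1) - g(x0))
               for c, x0, x1 in zip(tags, points, points[1:]))

f = lambda x: x
g = lambda x: x * x

n = 2000
P = [k / n for k in range(n + 1)]                       # uniform partition of [0, 1]
tags = [0.5 * (x0 + x1) for x0, x1 in zip(P, P[1:])]    # midpoints as c_k

approx = rs_sum(f, g, P, tags)
exact = 2.0 / 3.0   # Riemann integral of f(x) * g'(x) = 2 x^2 over [0, 1]
print(approx, exact)
```

Other tag choices (left or right end points) converge to the same value as the size of the partition tends to 0, only more slowly.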


Finally we consider the special case when f, g are complex valued or
even real valued. Suppose g is of bounded variation. Then on C([a, b])
we obtain a bounded functional


$$f \mapsto \int_a^b f \, dg.$$


By Proposition 1.1, and the linearity of the integral in f and g, we may
decompose the integral into a sum of four terms, each term involving
only real valued functions. Furthermore, if $f \in C_c(\mathbf{R})$, then the support of
f lies in some bounded interval [a, b] and we may define


$$\int_{\mathbf{R}} f \, dg = \int_a^b f \, dg,$$


the right side being independent of the choice of [a, b]. Suppose that
there exists a number B > 0 such that


$$V(f, a, b) \leq B \quad \text{for all } [a, b],$$


so the variations are uniformly bounded on finite intervals. In this latter
case, the space of such functions is denoted by $BV(\mathbf{R})$, and called the
space of functions of bounded variation on $\mathbf{R}$. On this space, we define


$$V_{\mathbf{R}}(f) = \sup_{a, b} V(f, a, b).$$


Then $V_{\mathbf{R}}(f)$ is a seminorm on $BV(\mathbf{R})$, and results of this section extend to
$BV(\mathbf{R})$. In particular, the estimate of Proposition 1.3 holds over $\mathbf{R}$, that
is


$$\left| \int_{\mathbf{R}} f \, dg \right| \leq \|f\| V_{\mathbf{R}}(g)$$

for $f \in C_c(\mathbf{R})$ and $g \in BV(\mathbf{R})$. By the general representation theorem, there
exists a unique complex valued measure $\mu_g$ such that


$$\int_{\mathbf{R}} f \, dg = \int_{\mathbf{R}} f \, d\mu_g \quad \text{for all } f \in C_c(\mathbf{R}).$$


The variation $V_{\mathbf{R}}(g)$ simply provides an explicit concrete example of
the general notion of the norm $\|\mu_g\|$ of the measure. We call $\mu_g$ the
Riemann-Stieltjes measure associated with g.

If h is a bounded increasing function, then $\mu_h$ is a finite positive
measure on R. It is immediate from the proof of Theorem 1.2 that the 
decomposition of a real valued function g of bounded variation on a 
finite closed interval is also valid on R, by the same formula using the 
total variation. Then the decomposition of Theorem 1.2 gives an exam- 
ple when a real valued measure is expressed as a difference of two 
positive measures. 

In some applications of the next section, we shall mix conditions when
a function is in $L^1(\mathbf{R})$ and also conditions of bounded variation. The following
observation is sometimes useful.


Lemma 1.7. Suppose $f \in L^1(\mathbf{R}) \cap BV(\mathbf{R})$. Then f vanishes at infinity, in
the sense that


$$\lim_{x \to \infty} f(x) = 0 \quad \text{and} \quad \lim_{x \to -\infty} f(x) = 0.$$


Indeed, suppose $f(x) \geq c > 0$ for arbitrarily large x. Then $f(x) \geq c/2$ (say)
for all x sufficiently large, since otherwise f would not be of bounded
variation, and this contradicts $f \in L^1(\mathbf{R})$.
Under these circumstances, in a situation when integration by parts is
valid for finite intervals as in Proposition 1.4, the extra terms


$$f(x)g(x) \Big|_{-\infty}^{\infty} = \lim_{a \to -\infty, \; b \to \infty} [f(b)g(b) - f(a)g(a)]$$


vanish.


Proposition 1.8.


(a) Let f be of bounded variation. Then the set of points of discontinuity
of f is countable.
(b) Assume f continuous at a and b with a < b. Then


$$f(b) - f(a) = \int_a^b d\mu_f(x).$$


In particular, the set consisting of the single point a has $\mu_f$-measure 0.



Proof. We leave part (a) as an easy exercise (see Exercise 4), done by 
a routine estimate. As for part (b), we note that the constant function 1 
is continuous on [a,b], and all Riemann-Stieltjes sums give the same 
value f(b) — f(a), as desired. 


I have given above enough results on functions of bounded variation
to indicate the flavor of the theory, and to suffice for some applications:
to the spectral theorem of Chapter XX, §5, and to Fourier analysis in the
next section. For more on functions of bounded variation, see Natanson
[Nat].

We end this section with a useful mean value theorem, which did not 
fit anywhere else, and illustrates once more the technique of summation 
by parts. 


Proposition 1.9 (Integral Mean Value Theorem (Bonnet, 1849)). Let f
and g be Riemann integrable on [a, b]. Assume that f is positive increasing.
Then there exists $c \in [a, b]$ such that


$$\int_a^b f(x)g(x) \, dx = f(b) \int_c^b g(x) \, dx.$$


Proof. Let $P = [a = x_0, x_1, \ldots, x_n = b]$ be a partition of [a, b] such that
the Riemann sum


$$S = \sum_{i=1}^{n} f(x_i) g(x_i)(x_i - x_{i-1}) \quad \text{approximates} \quad \int_a^b f(x)g(x) \, dx.$$

Let $a_i = f(x_i)$ and $b_i = g(x_i)(x_i - x_{i-1})$. Summing by parts, and letting

$$B_k = \sum_{i=k}^{n} g(x_i)(x_i - x_{i-1}),$$

we find (putting $B_{n+1} = 0$) that

$$S = f(x_1) B_1 + \sum_{k=2}^{n} B_k \bigl( f(x_k) - f(x_{k-1}) \bigr).$$


Since $f(x_n) = f(b)$, we find


$$f(b) \min_k B_k \leq S \leq f(b) \max_k B_k.$$


But $B_k$ is a Riemann sum itself for g over the interval $[x_{k-1}, x_n] = [x_{k-1}, b]$.
Varying $x_{k-1}$, we see that given $\varepsilon$, we obtain


$$\inf_c f(b) \int_c^b g(x) \, dx - 2\varepsilon \leq \int_a^b f(x)g(x) \, dx \leq \sup_c f(b) \int_c^b g(x) \, dx + 2\varepsilon,$$


whence the same inequality without the $\varepsilon$. But


$$c \mapsto \int_c^b g(t) \, dt$$


is continuous, so by the intermediate value theorem for continuous functions,
the proposition follows.
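
To see the statement at work, here is a sketch that locates a point c numerically in one simple case. The assumptions are ours: $f(x) = x^2$ and $g = \cos$ on [0, 1], integrals computed by a composite Simpson rule, and c found by bisection. Bisection is justified here only because $g \geq 0$ on the interval, so that $c \mapsto \int_c^b g$ is decreasing; for a general g one would have to search for a sign change first. The names `simpson` and `bonnet_point` are hypothetical.

```python
import math

def simpson(h, lo, hi, n=2000):
    """Composite Simpson rule for h on [lo, hi] with n (even) subintervals."""
    step = (hi - lo) / n
    total = h(lo) + h(hi)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * h(lo + k * step)
    return total * step / 3

def bonnet_point(f, g, a, b, tol=1e-10):
    """Find c in [a, b] with  int_a^b f g = f(b) * int_c^b g  by bisection.

    Assumes f is positive increasing and g >= 0 on [a, b], so that
    gap(a) >= 0 >= gap(b) and gap is decreasing.
    """
    target = simpson(lambda x: f(x) * g(x), a, b)

    def gap(c):
        return f(b) * simpson(g, c, b) - target

    lo, hi = a, b
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if gap(mid) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

c = bonnet_point(lambda x: x * x, math.cos, 0.0, 1.0)
print(c)
```

In this example both sides can be computed in closed form, and the point c satisfies $\sin c = 2 \sin 1 - 2 \cos 1$.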


X, §2. APPLICATIONS TO FOURIER ANALYSIS 


In this section we give applications of integration to a more refined 
Fourier analysis, especially using the results on functions of bounded 
variation from §1. We start however with the simplest situation contain- 
ing no delicate estimates. The next result is a routine version of the 
Riemann—Lebesgue Lemma. 


Proposition 2.1. Let $f \in C_c^\infty(\mathbf{R})$. Given a positive integer k, there exists
a constant C = C(f, k) such that for $A \geq 1$ we have


$$\left| \int_{-\infty}^{\infty} f(x) e^{iAx} \, dx \right| \leq \frac{C}{A^k}.$$


Proof. We integrate by parts k times, so that a power $A^k$ comes into
the denominator. The constant C can be taken to be the $L^1$-norm of the k-th
derivative of f, which is bounded by the sup norm of that derivative times the
length of an interval containing the support of f. Since f is assumed to have
compact support, the term uv in the
integration by parts disappears since f(x) = 0 for x large positive and
large negative. So the proposition is clear.


In particular, the proposition gives a quantitative estimate for the limit


$$\lim_{A \to \infty} \int_{-\infty}^{\infty} f(x) e^{iAx} \, dx = 0.$$


We may consider next the variation when we must take an end point 
into consideration. 

We now introduce the condition of bounded variation. According 
to Zygmund [Zy], Dirichlet was the first who proved Fourier series 
convergence theorems under the conditions of normalization and of being
increasing, which amounts to bounded variation. See Zygmund’s com- 
ments, Chapter II, 2.6. 

We recall that for a function $f \in L^1(\mathbf{R})$, the Fourier transform $f^{\wedge}$ is
defined by

$$f^{\wedge}(y) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} f(x) e^{-ixy} \, dx.$$

The function $f^{\wedge}$ is obviously bounded, by $\|f\|_1$.


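Proposition 2.2. Let $f \in BV(\mathbf{R}) \cap L^1(\mathbf{R})$. Then there is a constant C =
C(f) such that for all $t \neq 0$ we have


$$|f^{\wedge}(t)| = \frac{1}{\sqrt{2\pi}} \left| \int_{-\infty}^{\infty} f(x) e^{-itx} \, dx \right| \leq \frac{C}{|t|}.$$


Proof. We integrate by parts, obtaining for the integral the value


$$-\frac{1}{it} f(x) e^{-itx} \Big|_{-\infty}^{\infty} + \frac{1}{it} \int_{-\infty}^{\infty} e^{-itx} \, df(x).$$


Since $f \in BV(\mathbf{R}) \cap L^1(\mathbf{R})$, we remarked in Lemma 1.7 that $f(x) \to 0$ as
$x \to \pm\infty$, so the first term vanishes. The integral of the second term is
finite since f has bounded variation, so the estimate $\leq C/|t|$ is evident.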
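
For a concrete instance of this decay, take f to be the indicator function of [-1, 1] (our choice of example), whose variation over $\mathbf{R}$ is 2. The sketch below approximates $\int f(x) e^{-itx} \, dx$ by a midpoint rule and compares its absolute value with $V_{\mathbf{R}}(f)/|t|$, which is the bound coming from the second term in the proof; the normalizing factor $1/\sqrt{2\pi}$ is omitted, and the function name is hypothetical.

```python
import cmath
import math

def transform_integral(t, n=20000):
    """Midpoint-rule approximation of the integral of e^{-itx} over [-1, 1]."""
    h = 2.0 / n
    return h * sum(cmath.exp(-1j * t * (-1.0 + (k + 0.5) * h)) for k in range(n))

total_variation = 2.0   # the indicator of [-1, 1] jumps by 1 at -1 and at 1

samples = [(t, abs(transform_integral(t))) for t in (1.0, 5.0, 20.0, 50.0)]
for t, value in samples:
    print(t, value, total_variation / t)
```

In this case the integral equals $2 \sin(t)/t$ exactly, so the bound $2/|t|$ is attained whenever $|\sin t| = 1$.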


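Proposition 2.3. Let $f \in L^1(\mathbf{R})$. Then


$$\lim_{A \to \infty} \int_{-\infty}^{\infty} f(x) e^{iAx} \, dx = 0.$$


Proof. Let $g \in C_c^\infty(\mathbf{R})$ be such that $\|f - g\|_1 < \varepsilon$. Then the integral can
be written and estimated in the form


$$\int_{-\infty}^{\infty} \bigl( f(x) - g(x) \bigr) e^{iAx} \, dx + \int_{-\infty}^{\infty} g(x) e^{iAx} \, dx.$$


The first integral is bounded in absolute value by $\varepsilon$. The second integral
is $< \varepsilon$ for large A by Proposition 2.1. This proves the proposition.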


Next we look at Fourier inversion under more delicate conditions
than those of the Schwartz space.
Let A > 0, and let $f \in L^1(\mathbf{R})$. We define


$$f_A(x) = \frac{1}{\sqrt{2\pi}} \int_{-A}^{A} f^{\wedge}(t) e^{itx} \, dt.$$


We are interested in conditions under which we have $f^{\wedge\wedge} = f^{-}$, in the
sense that


$$f^{\wedge\wedge}(-x) = \lim_{A \to \infty} f_A(x).$$

We begin by a lemma which gives another expression for $f_A$.


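Lemma 2.4. Let $f \in L^1(\mathbf{R})$ and let A > 0. Then


$$f_A(x) = \frac{1}{\sqrt{2\pi}} \int_{-A}^{A} f^{\wedge}(t) e^{itx} \, dt = \frac{1}{\pi} \int_{-\infty}^{\infty} f(y) \frac{\sin A(x - y)}{x - y} \, dy.$$


If f is also in $BV(\mathbf{R})$, then $f_A$ is bounded independently of A.


Proof. We have

$$\sqrt{2\pi} \int_{-A}^{A} f^{\wedge}(t) e^{itx} \, dt = \int_{-A}^{A} \int_{-\infty}^{\infty} f(y) e^{-iyt} e^{itx} \, dy \, dt$$

$$= \int_{-\infty}^{\infty} f(y) \int_{-A}^{A} e^{it(x - y)} \, dt \, dy \quad \text{[by Fubini's theorem]}$$

$$= \int_{-\infty}^{\infty} f(y) \, \frac{2 \sin A(x - y)}{x - y} \, dy,$$


which proves the formula. The boundedness of $f_A$ if $f \in BV(\mathbf{R})$ is left as
Exercise 1 (Stieltjes integration by parts).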


The next theorem gives conditions for $f^{\wedge\wedge} = f^{-}$. It involves an application
of the Bonnet mean value theorem. The theorem is in Titchmarsh
[Ti], who attributes the result to Prasad, Pringsheim, and Hobson.


Theorem 2.5. Let $f\colon \mathbf{R} \to \mathbf{R}$ be a function satisfying:


(a) f is of bounded variation on every finite closed interval.
(b) f is in $L^1(\mathbf{R})$.
(c) f is normalized so that for all $x \in \mathbf{R}$,


$$f(x) = \tfrac{1}{2} [f(x+) + f(x-)].$$

Then


$$f(x) = \lim_{A \to \infty} \frac{1}{\pi} \int_{-\infty}^{\infty} f(y) \frac{\sin A(x - y)}{x - y} \, dy = \lim_{A \to \infty} f_A(x).$$


Proof. After a translation we may assume that x = 0. Since $(\sin Ay)/y$
is an even function of y, we may assume that f is even, and we are
reduced to proving


$$\frac{\pi}{2} f(0) = \lim_{A \to \infty} \int_0^\infty f(y) \frac{\sin Ay}{y} \, dy$$

under the additional assumption that f is continuous at 0. We shall now
see that the result depends only on the local behavior of f near 0. Pick
some b > 0. Then


$$\lim_{A \to \infty} \int_b^\infty f(y) \frac{\sin Ay}{y} \, dy = 0$$


by Proposition 2.3. Therefore we are reduced to consider the integral in
the finite interval [0, b], where f is bounded. By additivity, we may also
assume without loss of generality that f is increasing, since a function of
bounded variation is the difference of two increasing functions.

We observe that the theorem is valid when f is constant on [0, b].
Indeed, in that case, we note from the change of variables u = Ay,
du = A dy, that


$$\lim_{A \to \infty} \int_0^b \frac{\sin Ay}{y} \, dy = \lim_{A \to \infty} \int_0^{Ab} \frac{\sin u}{u} \, du = \frac{\pi}{2}.$$


Since the integral is linear in f, we may assume without loss of generality
that f(0) = 0. Given $\varepsilon$ there exists $\delta$ such that for $0 \leq y \leq \delta$ we have


$$f(y) - f(0) = |f(y)| < \varepsilon.$$


Then


$$\int_0^b f(y) \frac{\sin Ay}{y} \, dy = \int_0^\delta f(y) \frac{\sin Ay}{y} \, dy + \int_\delta^b f(y) \frac{\sin Ay}{y} \, dy.$$


The second integral on the right approaches 0 as $A \to \infty$ by Proposition
2.3 because 1/y is bounded for $y \geq \delta$. So what matters is the
integral over $[0, \delta]$. By the integral mean value theorem, we now find


$$\int_0^\delta f(y) \frac{\sin Ay}{y} \, dy = f(\delta) \int_c^\delta \frac{\sin Ay}{y} \, dy = f(\delta) \int_{Ac}^{A\delta} \frac{\sin u}{u} \, du.$$


Since $|f(\delta)| < \varepsilon$, and since the integral over $[Ac, A\delta]$ is bounded, we have
concluded the proof of the theorem.
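
The role of the normalization (c) can be observed numerically at a jump. In the sketch below (our own example, not from the text) f is the indicator function of [0, 1], normalized to take the value 1/2 at 0 and at 1. By Lemma 2.4, $f_A(0) = \frac{1}{\pi} \int_0^1 \frac{\sin Ay}{y} \, dy$, which is approximated by a midpoint rule; the function name `partial_inverse_at_zero` is hypothetical.

```python
import math

def partial_inverse_at_zero(A, n=20000):
    """(1/pi) * integral over [0, 1] of sin(A*y)/y dy, by the midpoint rule.

    This is f_A(0) for f the indicator function of [0, 1].
    """
    h = 1.0 / n
    total = 0.0
    for k in range(n):
        y = (k + 0.5) * h          # midpoints avoid the removable point y = 0
        total += math.sin(A * y) / y
    return total * h / math.pi

values = {A: partial_inverse_at_zero(A) for A in (10.0, 50.0, 200.0)}
for A, v in values.items():
    print(A, v)
```

As A grows the values approach 1/2, the mean of the left and right limits of f at 0, and not either one-sided limit.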


In the next theorem, we deal with yet another situation, when we do
not assume an $L^1(\mathbf{R})$-condition, but deal with oscillatory integrals to
guarantee convergence in a quantitative form of the Riemann-Lebesgue
Lemma. The result concerns a situation intermediate between Proposition
2.2 and Proposition 2.3, and in the precise given form is due to
Barner [Ba 90].
Theorem 2.6. Assume:


(a) $f \in BV(\mathbf{R})$.
(b) $f(x) = O(|x|^\varepsilon)$ for some $\varepsilon > 0$ and $x \to 0$.


Then the improper integral that follows exists for A > 0, and satisfies
the bound


$$\int_0^\infty f(y) e^{iAy} \, \frac{dy}{y} = O_f\bigl( A^{-\varepsilon/(1+\varepsilon)} \bigr) \quad \text{for } A \to \infty.$$


Proof. We can split $e^{iAy} = \cos(Ay) + i \sin(Ay)$ and consider separately
the cosine and sine. The cosine is harder if anything because one has to
rely on assumption (b) on f to make the integrand integrable near 0.
Hence we deal with the cosine for concreteness to fix ideas. We shall
first prove that the integral exists (the problem is at infinity), and then we
shall prove that it satisfies the stated bound as $A \to \infty$.


Existence. There will be some integrations by parts, so we need the
two auxiliary functions defined respectively for x > 0 and all $x \in \mathbf{R}$:


$$\mathrm{ci}(x) = -\int_x^\infty \frac{\cos t}{t} \, dt \quad \text{and} \quad \mathrm{si}(x) = -\int_x^\infty \frac{\sin t}{t} \, dt.$$


Lemma 2.7. We have the estimates for x > 0:

$$|\mathrm{ci}(x)| \leq \frac{2}{x} \quad \text{and} \quad |\mathrm{si}(x)| \leq \frac{2}{x}.$$


Proof. This is immediate after integrating by parts.


To prove the existence of the improper integral in the theorem, we
define


$$g(y) = \frac{\cos Ay}{y} \quad \text{and} \quad G(y) = -\int_y^\infty \frac{\cos At}{t} \, dt = \mathrm{ci}(Ay)$$


for y > 0. Let $\delta > 0$. By Proposition 1.4 justifying integration by parts,
for $\delta < a < \infty$, we get:


$$\int_\delta^a f(y) \, dG(y) = f(a)G(a) - f(\delta)G(\delta) - \int_\delta^a G(y) \, df(y).$$


We now estimate each one of the three terms on the right side.
Since f is bounded by hypothesis, and $G(a) \to 0$ as $a \to \infty$, we see that
$f(a)G(a) \to 0$ as $a \to \infty$, so the first term approaches 0 as $a \to \infty$.
We claim that $f(\delta)G(\delta) \to 0$ as $\delta \to 0$. To see this, we split the integral:


$$\mathrm{ci}(x) = -\int_x^1 \frac{\cos t}{t} \, dt - \int_1^\infty \frac{\cos t}{t} \, dt$$

$$= -\int_x^1 \frac{\cos t - 1}{t} \, dt - \int_x^1 \frac{dt}{t} + \text{constant}$$

$$= \text{convergent integral}(x) + \log x + \text{constant} \quad \text{as } x \to 0.$$


Let x = Ay so log x = log A + log y. Consider f(x) ci(Ax) = f(x)G(x). The
hypothesis that $f(x) = O(|x|^\varepsilon)$ immediately implies that $f(x) \, \mathrm{ci}(Ax) \to 0$ as
$x \to 0$, thus proving the claim that $f(\delta)G(\delta) \to 0$ as $\delta \to 0$.
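
The splitting of ci(x) into a convergent integral, a logarithm and a constant also gives a practical way to compute it. The sketch below relies on a standard identity that is not stated in the text: $\mathrm{ci}(x) = \gamma + \log x + \int_0^x \frac{\cos t - 1}{t} \, dt$, where $\gamma$ is Euler's constant. The integral is evaluated by a composite Simpson rule, the bound of Lemma 2.7 is checked at a few points, and the claim $f(\delta)G(\delta) \to 0$ is illustrated with the assumed choices $f(x) = x^{1/2}$ and A = 1.

```python
import math

EULER_GAMMA = 0.5772156649015329

def ci(x, n=4000):
    """ci(x) = -integral from x to infinity of cos(t)/t dt, computed as
    gamma + log(x) + integral from 0 to x of (cos(t) - 1)/t dt (Simpson rule)."""
    def integrand(t):
        # (cos t - 1)/t extends continuously by 0 at t = 0
        return (math.cos(t) - 1.0) / t if t != 0.0 else 0.0
    h = x / n
    total = integrand(0.0) + integrand(x)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * integrand(k * h)
    return EULER_GAMMA + math.log(x) + total * h / 3.0

# Lemma 2.7: |ci(x)| <= 2/x for x > 0.
bound_checks = [(x, abs(ci(x)), 2.0 / x) for x in (1.0, 5.0, 10.0, 30.0)]

# f(delta) * ci(A * delta) with f(x) = x**0.5 and A = 1: tends to 0 despite the log.
products = [d ** 0.5 * ci(d) for d in (1e-2, 1e-4, 1e-6)]
print(bound_checks, products)
```

The logarithmic growth of ci near 0 is visible in `products` only through the slow rate at which the values shrink.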

Now to the third term. The limit


$$\lim_{\delta \to 0} \int_\delta^a f(y) \, dG(y)$$


exists by the hypothesis on f, so the integral


$$\int_0^a G(y) \, df(y)$$


exists. There remains only to prove that the tail end


$$\int_a^b G(y) \, df(y) = \int_a^b \mathrm{ci}(Ay) \, df(y)$$


approaches 0 as $a \to \infty$ and a < b. But the routine estimate of Proposition
1.3 gives for the absolute value of the integral a bound $V(f) \cdot 2/(Aa)$,
using also Lemma 2.7. Hence


$$\lim_{a \to \infty} \int_0^a G(y) \, df(y) \quad \text{exists.}$$


This concludes the proof of existence for the improper integral of Theorem
2.6.


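Estimate of the integral. We shall now prove the stated estimate. We
let $\lambda = 1/(1 + \varepsilon)$. We decompose the integral:


$$\int_0^\infty f(y) \frac{\cos Ay}{y} \, dy = \int_0^{1/A^\lambda} f(y) \frac{\cos Ay}{y} \, dy + \int_{1/A^\lambda}^\infty f(y) \frac{\cos Ay}{y} \, dy$$

$$= \int_0^{1/A^\lambda} f(y) \frac{\cos Ay}{y} \, dy + \bigl[ f(y) \, \mathrm{ci}(Ay) \bigr]_{1/A^\lambda}^{\infty} - \int_{1/A^\lambda}^\infty \mathrm{ci}(Ay) \, df(y)$$

$$= \int_0^{1/A^\lambda} \frac{f(y)}{y} \cos(Ay) \, dy - f(1/A^\lambda) \, \mathrm{ci}(A^{1-\lambda}) - \int_{1/A^\lambda}^\infty \mathrm{ci}(Ay) \, df(y).$$


We shall estimate each one of the three terms on the right side of the
equality.
For the first term, the integral is taken near zero where


$$|f(y)/y| = O(y^{\varepsilon - 1})$$


and cos Ay is bounded. Hence the first integral in absolute value is


$$\ll \frac{1}{\varepsilon} \frac{1}{A^{\lambda \varepsilon}} \quad \text{for } A \to \infty,$$


which is of the desired type.
For the second expression, using Lemma 2.7, we have the bound


$$\ll \frac{1}{A^{\lambda \varepsilon}} \cdot \frac{2}{A^{1 - \lambda}},$$


which is also of the desired type.
For the third expression, we have the usual sup bound for the integral,


$$\left| \int_{1/A^\lambda}^{b} \mathrm{ci}(Ay) \, df(y) \right| \leq \frac{2}{A^{1 - \lambda}} V_{\mathbf{R}}(f)$$


for every b, and hence for the integral to $\infty$. This concludes the proof of
the theorem.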


Remark. The previous theorem is similar to many of the same kind,
giving various conditions under which the Fourier transform tends to 0,
which is a form of the Riemann-Lebesgue Lemma. The method of proof
also follows a standard pattern, whereby one splits the integral over
various intervals, near 0 and near $\infty$, with a cut-off point chosen judiciously.
In determining the cut-off point, one may think of f(x) as actually
being equal to $x^\varepsilon$, and one first determines the cut-off as $1/A^\lambda$, or a
small perturbation thereof. I don't know of a single theorem which covers
all examples of that kind. Barner's conditions have proved to be useful in
the applications he had in mind, and were motivated by those applications.


For a Parseval formula in a context similar to that of the present 
section, see [JoL]. 
X, §3. EXERCISES


1. Let $f \in BV(\mathbf{R})$ and also $f \in L^1(\mathbf{R})$. Define


s(x y= [et sin t 


Show that for A > 0: 


20 in A _ 
| / f(y ay < 2SVa(), 
or better, 
Mal 5 vals) 


2. Let $\{f_n\}$ be a sequence of continuous functions on [a, b], converging uniformly
to f. Let g be of bounded variation on [a, b]. Prove that


$$\lim_{n \to \infty} \int_a^b f_n \, dg = \int_a^b f \, dg.$$


3. Let f be continuous on [a, b] and let $\{g_n\}$ be a sequence of functions of
bounded variation, such that the variations are uniformly bounded, that is
there exists a constant B such that


$$V(g_n, [a, b]) \leq B.$$


Assume that $g_n(x)$ converges to g(x) for some bounded function g and all
$x \in [a, b]$. Show that


$$\lim_{n \to \infty} \int_a^b f \, dg_n = \int_a^b f \, dg.$$


4. Let f be of bounded variation on [a, b]. For each $x \in [a, b]$ define the jump


$$J_f(x) = \lim_{r \to 0} \sup |f(y) - f(x)|$$


where the sup is taken for $y \in [x - r, x + r]$. Prove:

(a) Given $\varepsilon$, there is only a finite number of points $x \in [a, b]$ such that
$J_f(x) \geq \varepsilon$.

(b) The set of points where f is not continuous is countable. [Show that f is
continuous at x if and only if $J_f(x) = 0$.]


5. Let f be a real valued function on [a, b]. If the set of discontinuities of f has 
measure 0, show that f is Riemann-integrable. The converse is also true, and 
you can prove it as an exercise if you are sufficiently interested. 


CHAPTER XI 


Distributions 


In Chapter IX, we saw how certain functionals on $C_c(X)$ gave rise to a
measure. Here we consider the case when $X = \mathbf{R}^n$ and the functionals
satisfy additional continuity conditions with respect to differentiation.


XI, §1. DEFINITION AND EXAMPLES 


We let D, = 0/0x; be the i-th partial derivative applied to functions on R”. 
For a p-tuple (p,,...,p,) = p of integers 2 0, we let 


D? = DP: --- DPn 


and |p| =p, +:::+p,. Then each differential operator D? operates on 
functions in C%(R"). Actually, we shall deal with the subspace of func- 
tions C;°(R") which have compact support. If f is a function, we let M, 
be the operator which consists in multiplying by f, so that we have 
M,(g) = fg, for any function g. We have the formula 


D,°0 M,; = M;° D; + Mp, 5 
for any f € C*(R"). We shall consider general differential operators 


$$D = \sum_{|p| \leq m} \alpha_p D^p$$

with coefficients $\alpha_p \in C^\infty(\mathbf{R}^n)$. Because of the preceding formula, we see 
that such differential operators, viewed as linear maps on $C_c^\infty(\mathbf{R}^n)$, form 
an algebra under composition. The differential operator being written as 
above, we say that it has order $\leq m$. It is easy to verify that its expression 
as above determines the coefficients $\alpha_p$ uniquely. Indeed, suppose 
that $D = 0$. To prove that $\alpha_p = 0$ it suffices to prove that for any $a \in \mathbf{R}^n$ 
we have $\alpha_p(a) = 0$. For a given $p$ we consider a function given locally 
near $a$ by 

$$f(x) = (x - a)^p = (x_1 - a_1)^{p_1} \cdots (x_n - a_n)^{p_n}.$$

Then 

$$D^q f(a) = \begin{cases} 0 & \text{if } p \neq q, \\ p! & \text{if } p = q, \end{cases}$$

where $p! = p_1! \cdots p_n!$. Hence $(Df)(a) = p!\,\alpha_p(a) = 0$, whence 
$\alpha_p(a) = 0$. 
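
To make the uniqueness argument concrete, here is a small worked case. It is only a sketch: the choice of one variable ($n = 1$) and order $m = 2$ is ours, made for illustration.

```latex
% Illustrative case (our choice): n = 1, m = 2, so
%   D = \alpha_0 + \alpha_1 D_1 + \alpha_2 D_1^2.
% Test function near a:  f(x) = (x - a)^2,  i.e. p = 2.
\begin{align*}
f(a) &= 0, \qquad (D_1 f)(a) = 2(a - a) = 0, \qquad (D_1^2 f)(a) = 2 = 2!, \\
(Df)(a) &= \alpha_0(a)\cdot 0 + \alpha_1(a)\cdot 0 + \alpha_2(a)\cdot 2!
         = 2\,\alpha_2(a).
\end{align*}
% If D = 0, then 2\alpha_2(a) = 0 for every a, so \alpha_2 = 0.
% The test functions (x - a)^1 and (x - a)^0 = 1 isolate \alpha_1(a)
% and \alpha_0(a) in the same way.
```

Each monomial $(x - a)^p$ picks out exactly one coefficient at the point $a$, which is why no elimination between different $p$ is needed.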


We define seminorms on $C_c^\infty(\mathbf{R}^n)$ as follows. For each differential 
operator $D$ and $f \in C_c^\infty(\mathbf{R}^n)$ we let 

$$\pi_D(f) = \|Df\|$$

where $\|\ \|$ is the sup norm, and for each integer $m \geq 0$ we let 

$$\pi_m(f) = \sup_{|p| \leq m} \|D^p f\|.$$

It is clear that $\pi_D$ and $\pi_m$ are seminorms on $C_c^\infty(\mathbf{R}^n)$. 

For any subset $K$ of $\mathbf{R}^n$ we denote by $C_c^\infty(K)$ the space of those 
functions in $C_c^\infty(\mathbf{R}^n)$ whose support lies in $K$. We define a distribution on 
an open set $U$ of $\mathbf{R}^n$ to be a linear map 

$$T\colon C_c^\infty(U) \to \mathbf{C}$$

such that, for every compact set $K$ contained in $U$, there exist a constant 
$A_K$ and an integer $m$ for which 

$$|T\varphi| \leq A_K \pi_m(\varphi), \quad \text{all } \varphi \in C_c^\infty(K).$$

Just as it is useful to have a criterion for continuity of a linear map in 
terms of sequences when dealing with normed vector spaces, we have a 
similar criterion under the present circumstances, namely: 


Theorem 1.1. A linear map $T\colon C_c^\infty(U) \to \mathbf{C}$ is a distribution if and only 
if it satisfies the following property: 

Let $\{\varphi_k\}$ be a sequence in $C_c^\infty(U)$, such that all $\varphi_k$ have support in a 
compact set $K$, and such that for every $p$, $\{D^p \varphi_k\}$ converges to 0 uniformly 
on $K$. Then $T\varphi_k \to 0$. 


Proof. It is clear that if $T$ is a distribution, then it satisfies the stated 
property. Conversely, assume that it satisfies this property, and let $K$ be 
a compact subset of $U$. For each integer $m \geq 0$ let 

$$a_m = \sup |Tf|,$$

the sup being taken for those $f \in C_c^\infty(K)$ such that $\pi_m(f) \leq 1$. It will 
suffice to show that for some $m$, we have $a_m \neq \infty$. Suppose that $a_m = \infty$ 
for all $m$. Choose $f_m \in C_c^\infty(K)$ such that $\pi_m(f_m) \leq 1$, but $|Tf_m| \geq m$. Let 
$g_m = f_m/m$. Then 

$$\pi_m(g_m) \leq 1/m$$

and if $k \leq m$, then 

$$\pi_k(g_m) \leq \pi_m(g_m) \leq 1/m,$$

so 

$$\lim_{m \to \infty} \pi_k(g_m) = 0.$$

But 

$$\|D^p g_m\| \leq \pi_{|p|}(g_m)$$

tends to 0 as $m \to \infty$, and $g_m$ has support in $K$. Thus $D^p g_m$ tends to 
0 uniformly on $K$, and $Tg_m \to 0$ by hypothesis. Since $|Tg_m| \geq 1$ for all $m$, 
this is a contradiction, which proves the theorem. 


We say that a distribution $T$ on an open set $U$ is of order $\leq m$ if for 
each compact set $K \subset U$ there exists a number $A_K$ such that 

$$|T\varphi| \leq A_K \pi_m(\varphi)$$

for all $\varphi \in C_c^\infty(K)$. A distribution of order 0 is therefore a $C_c$-functional 
as in Chapter IX, §1 and the Remark at the end of §4. 


We shall now give examples of distributions. 

Functions. Let $f$ be a locally integrable function on an open set $U$ of 
$\mathbf{R}^n$. (We recall that this means: $f$ is $\mu$-measurable for Lebesgue measure 
$\mu$, and $f$ is integrable on every compact subset $K$ of $U$.) We associate 
with $f$ the map $T_f$ whose value on $\varphi \in C_c^\infty(U)$ is given by 

$$T_f(\varphi) = \int_U f(x)\varphi(x)\,dx = \int_U f\varphi\,d\mu.$$




Then it is clear that $T_f$ is a distribution of order 0 on each compact 
subset $K$ of $U$. In fact, if we use the obvious notation 

$$\|f\|_{1,K} = \int_K |f|\,d\mu,$$

then 

$$|T_f(\varphi)| \leq \|f\|_{1,K}\,\|\varphi\| \quad \text{if } \varphi \in C_c^\infty(K).$$

Furthermore, the map $f \mapsto T_f$ induces an injective linear map of $L^1(U)$ 
into the space of distributions on $U$, because we know from Corollary 
9.5 of Chapter VI that if $T_f = T_g$ for two locally integrable functions $f$, $g$, 
then $f$ is equal to $g$ almost everywhere. Thus from now on, we can 
interpret locally integrable functions as distributions. 


Measures. Similarly, let $\mu$ be a positive $\sigma$-regular Borel measure on 
the open set $U$ of $\mathbf{R}^n$. We know that $d\mu$ is a functional on $C_c(U)$, and 
since $C_c^\infty(U)$ is a subspace of $C_c(U)$, we can view $d\mu$ as a linear map on 
$C_c^\infty(U)$. Thus if $\varphi$ has support in $K$, we have 

$$|\langle \varphi, d\mu \rangle| \leq \mu(K)\,\|\varphi\|,$$

and we see again that $d\mu$ is a distribution of order 0 on every compact 
subset of $U$. We could use the notation $T_\mu$ instead of $d\mu$ for the preceding 
distribution. As with functions, we have an injective map $\mu \mapsto d\mu$ 
from the set of positive $\sigma$-regular Borel measures on $U$ into the set of 
distributions. 


The Dirac distribution $\delta$ given by 

$$\delta(\varphi) = \varphi(0)$$

is a special case of a distribution obtained from a measure, namely the 
Dirac measure. For each $a \in \mathbf{R}^n$, we also have the translate $\delta_a$ of $\delta$, given 
by 

$$\delta_a(\varphi) = \varphi(a).$$
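
As a quick check against the definition of order, the following sketch verifies that $\delta_a$ is of order 0. The constant $A_K = 1$ is simply what the estimate produces; nothing beyond the definitions above is assumed.

```latex
% For any compact K and any \varphi \in C_c^\infty(K):
\begin{align*}
|\delta_a(\varphi)| = |\varphi(a)| \leq \sup_x |\varphi(x)| = \|\varphi\| = \pi_0(\varphi).
\end{align*}
% So the defining inequality |T\varphi| \leq A_K \pi_m(\varphi)
% holds with A_K = 1 and m = 0, independently of K.
```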


Multiplication by a $C^\infty$ Function. Let $T$ be a distribution on $U$ and let 
$\alpha \in C^\infty(U)$. We define $\alpha T$ to be $T \circ M_\alpha$, so that 

$$(\alpha T)(\varphi) = T(\alpha\varphi).$$

It is immediately verified that $\alpha T$ is a distribution. 


Composition with Differential Operators. Let $T$ be a distribution and 
$D$ a differential operator on $U$. Then $T \circ D$ (also written $TD$) is a 
distribution on $U$. Its value at $\varphi$ is 

$$(TD)(\varphi) = T(D\varphi).$$

In particular, if $T$ can be represented by a locally integrable function $f$, 
then 

$$T_f D(\varphi) = \int_U f(x)\,D\varphi(x)\,dx.$$

The verification that $TD$ is a distribution is immediate from the 
definitions. 


XI, §2. SUPPORT AND LOCALIZATION 


Let $D$ be a differential operator on an open set $U$. For every open 
subset $V$ of $U$, we can view $D$ as a differential operator on $V$, namely 
considering the restriction of $D$ to those functions having support in $V$. 
We say that $D$ is equal to zero on $U$ if $D\varphi = 0$ for every $\varphi \in C_c^\infty(U)$. We 
say that $D$ is locally zero at a point $a \in U$ if $D$ is equal to zero on some 
open neighborhood of $a$, i.e. if there exists an open subset $V$ of $U$, 
containing $a$, such that $D\varphi = 0$ for every $\varphi \in C_c^\infty(V)$. We define the 
support of $D$ by describing its complement, namely: 

$a \notin \operatorname{supp}(D)$ if and only if $D$ is locally zero at $a$. 


Note that if D is locally zero at a, then D is locally zero at every point 
close to a, so that the support of D is closed in U. 

Let $T$ be a distribution on an open set $U$. We say that $T$ is zero on 
$U$ if $T\varphi = 0$ for each $\varphi \in C_c^\infty(U)$. If $T$ is a distribution on $U$, and $V$ is an 
open subset of $U$, then we can restrict $T$ from $C_c^\infty(U)$ to $C_c^\infty(V)$, and this 
restriction, denoted by $T|V$, is a distribution on $V$. We shall say that a 
distribution $T$ on $U$ is locally zero at a point $a \in U$ if there exists an 
open neighborhood $V$ of $a$ in $U$ such that the restriction of $T$ to $V$ is 
zero on $V$. We can thus define the support of $T$ by the condition: 

$a \notin \operatorname{supp}(T)$ if and only if $T$ is locally 0 at $a$. 


As before, we see that the support of T is closed in U. 

We can localize distributions just as we localized measures, by means 
of partitions of unity, which must now be taken to be of class $C^\infty$, not 
merely continuous. We restate this as a separate result. 


$C^\infty$ Partitions of Unity. Let $K$ be a compact set in $\mathbf{R}^n$ and let $\{U_j\}$ 
($j = 1, \ldots, m$) be an open covering of $K$. Then there exist functions $\varphi_j$ 
in $C_c^\infty(U_j)$ such that $\varphi_j \geq 0$, 

$$\sum_{j=1}^m \varphi_j \leq 1 \quad \text{and} \quad \sum_{j=1}^m \varphi_j = 1 \text{ on } K.$$


Proof. For each $x \in K$ we can find an open ball centered at $x$, of 
radius $r(x)$, such that the ball of twice this radius centered at $x$ is 
contained in some $U_j$. We cover $K$ by a finite number of such balls, say 
$B_1, \ldots, B_s$. For each $k = 1, \ldots, s$ we find a function $\psi_k$ which is $C^\infty$, 
which is equal to 1 on $B_k$, $0 \leq \psi_k \leq 1$, and such that $\psi_k$ vanishes outside 
a ball $B_k'$ centered at the same point as $B_k$ and having a slightly bigger 
radius. This is done by routine calculus technique, cf. Chapter VI, §9. 
[We recall briefly below how to do this.] Inductively, one then sees that 
if we let 

$$\alpha_1 = \psi_1, \quad \alpha_2 = \psi_2(1 - \psi_1), \quad \ldots, \quad \alpha_s = \psi_s(1 - \psi_1) \cdots (1 - \psi_{s-1}),$$

then on $B_1 \cup \cdots \cup B_s$ we have 

$$\alpha_1 + \cdots + \alpha_s = 1 - (1 - \psi_1) \cdots (1 - \psi_s).$$

This yields what we want, except for the fact that the indices may not be 
$j = 1, \ldots, m$. But it is trivial to adjust this as desired. All we have to do 
is to find for each $k$ an index $j(k)$ such that $B_k'$ is contained in $U_{j(k)}$, and 
then for each $j = 1, \ldots, m$ take the sum of those $\alpha_k$ such that $j(k) = j$, to 
obtain $\varphi_j$. 
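
The inductive step behind the sum formula can be written out as follows. This is a sketch of the routine verification; the abbreviation $P_k$ is our own notation, not the text's.

```latex
% Claim, for k = 1, ..., s:
%   \alpha_1 + ... + \alpha_k = 1 - P_k,  where  P_k = \prod_{i \le k} (1 - \psi_i).
% Base case k = 1:  \alpha_1 = \psi_1 = 1 - (1 - \psi_1).
% Inductive step, using \alpha_{k+1} = \psi_{k+1} P_k:
\begin{align*}
\alpha_1 + \cdots + \alpha_k + \alpha_{k+1}
  &= (1 - P_k) + \psi_{k+1} P_k \\
  &= 1 - (1 - \psi_{k+1}) P_k \\
  &= 1 - P_{k+1}.
\end{align*}
% On B_i we have \psi_i = 1, so P_s = 0 there and the sum equals 1.
% Everywhere else 0 \le P_s \le 1, so the sum lies between 0 and 1.
```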

To get the function $\psi_k$ as in the preceding proof, we combine a 
function whose graph is indicated below with the square of the euclidean 
norm to get a $C^\infty$ function which is 1 on a ball, and 0 outside another 
ball of slightly bigger radius. 


If the ball is not centered at the origin, we combine this with a 
translation. 


Theorem 2.1. Let $T$ be a distribution on an open set $U$ in $\mathbf{R}^n$. If $T$ is 
locally zero at every point, then $T = 0$ on $U$. 

Proof. Let $\varphi \in C_c^\infty(U)$, and let $K$ be the support of $\varphi$. For each $a \in K$ 
we can find an open set $U_a$ such that $T$ is zero on $U_a$. We can cover $K$ 
with a finite number of such open sets, say $U_1, \ldots, U_m$. Let $\{\alpha_j\}$ be a $C^\infty$ 
partition of unity over $K$ with $j = 1, \ldots, m$, such that $\operatorname{supp} \alpha_j$ is contained 
in $U_j$. Then 

$$\varphi = \sum_{j=1}^m \alpha_j \varphi \quad \text{and} \quad T\varphi = \sum_{j=1}^m T(\alpha_j \varphi) = 0,$$

thus proving our assertion. 


Corollary 2.2. Two distributions which are locally equal everywhere are 
equal. 


Corollary 2.3. Let $T$ be a distribution on the open set $U$, and let 
$\varphi \in C_c^\infty(U)$. If $\operatorname{supp}(T) \cap \operatorname{supp}(\varphi)$ is empty, then $T\varphi = 0$. 


Proof. Let $K = \operatorname{supp} \varphi$, and $Q = \operatorname{supp} T$. There exists an open 
neighborhood $V$ of $K$ which does not intersect $Q$ and is contained in $U$. Let 
$\alpha \in C_c^\infty(V)$ be such that $\alpha = 1$ on $K$ and the support of $\alpha$ is contained in 
$V$. Then $\varphi = \alpha\varphi$ and 

$$T(\varphi) = T(\alpha\varphi).$$

It is immediately verified that $\alpha T$ is locally zero everywhere, and hence 
that $\alpha T = 0$, so that $T\varphi = 0$, as was to be proved. 


Corollary 2.4. Let $T$ be a distribution on an open set $U$, and assume 
that $T$ has compact support $K$. Let $\varphi, \psi \in C_c^\infty(U)$ and assume that 
$\varphi = \psi$ on an open neighborhood of $K$. Then $T\varphi = T\psi$. 


Proof. Since $\varphi - \psi$ is equal to 0 on an open neighborhood of $K$, it 
follows that the supports of $\varphi - \psi$ and $T$ are disjoint, whence we can 
apply Corollary 2.3 to conclude the proof. 


As an application of Corollary 2.4, we can extend the domain of 
definition of a distribution $T$ with compact support $K$ to the whole space 
$C^\infty(U)$. Indeed, if $\alpha$ is a function in $C_c^\infty(U)$ which is equal to 1 on an 
open neighborhood of $K$, and if $f \in C^\infty(U)$, then we define 

$$Tf = T(\alpha f).$$

The preceding corollary shows that this value is independent of the 
choice of $\alpha$ subject to the condition that $\alpha = 1$ on an open neighborhood 
of $K$, and that it is an extension of $T$; namely $T(\alpha\varphi) = T(\varphi)$ if $\varphi \in C_c^\infty(U)$. 
This extension is useful in a context like the following. Consider the 
function $f$ such that $f(x) = x$ (say in one variable $x$). If $T$ has compact 
support, then we can speak of the value $T(x) = Tf(x)$ using the definition 
we just made. 
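
For instance, the extension can be evaluated explicitly on a Dirac distribution. The sketch below takes $T = \delta_a$ in one variable, which is our choice of example; its support is the single point $a$.

```latex
% T = \delta_a on \mathbf{R}, with supp T = \{a\}; f(x) = x is not compactly supported.
% Choose any \alpha \in C_c^\infty(\mathbf{R}) with \alpha = 1 on a neighborhood of a.
\begin{align*}
T(x) = \delta_a(\alpha f) = \alpha(a)\, f(a) = 1 \cdot a = a.
\end{align*}
% The value does not depend on \alpha, as Corollary 2.4 predicts:
% only the behavior of \alpha f near a enters.
```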

Using partitions of unity over a whole open set, one can prove the 
following result, left as an exercise. 


Let $\{U_i\}$ be an open covering of an open set $U$ in $\mathbf{R}^n$. For each $i$, let $T_i$ 
be a distribution on $U_i$, and assume that for each pair $i$, $j$ the restrictions 
of $T_i$ and $T_j$ to $U_i \cap U_j$ are equal. Then there exists a unique 
distribution $T$ on $U$ which is equal to $T_i$ on each $U_i$. 


We give one more example of the localization principle, using partitions 
of unity over a compact set. We say that a distribution $T$ is locally 
of order $\leq m$ at a point $a$ if there exists a compact neighborhood $K$ of $a$ 
such that $T$ is of order $\leq m$ on $K$. 


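Theorem 2.5. If a distribution $T$ on $U$ is locally of order $\leq m$ at every 
point of $U$, then $T$ is of order $\leq m$ on every compact subset of $U$. 


Proof. Let $K$ be a compact subset of $U$. Let $\{\alpha_j\}$ ($j = 1, \ldots, k$) be a 
$C^\infty$ partition of unity over $K$, such that each $\alpha_j$ has support in an open 
set $U_j$ whose closure $\overline{U}_j$ is compact, contained in $U$, and such that $T$ is of 
order $\leq m$ on this closure. For any $\varphi \in C_c^\infty(K)$ we have 

$$T(\varphi) = \sum_{j=1}^k T(\alpha_j \varphi).$$

We note that $\operatorname{supp}(\alpha_j \varphi) \subset U_j \subset \overline{U}_j$. Let $A_j$ be a number such that 

$$|Tf| \leq A_j \pi_m(f), \quad \text{all } f \in C_c^\infty(\overline{U}_j).$$

Then 

$$|T(\varphi)| \leq \sum_{j=1}^k A_j \pi_m(\alpha_j \varphi).$$

But 

$$D^p(\alpha_j \varphi) = \sum_{|q| \leq |p|} \psi_{j,q} D^q \varphi$$

with suitable functions $\psi_{j,q}$ determined by $\alpha_j$ and $p$. Thus 

$$\|D^p(\alpha_j \varphi)\| \leq \sum_{|q| \leq |p|} \|\psi_{j,q}\|\,\|D^q \varphi\|.$$

Hence there is a constant $B_j$ such that 

$$\pi_m(\alpha_j \varphi) \leq B_j \pi_m(\varphi), \quad \text{all } \varphi \in C_c^\infty(K),$$

whence 

$$|T(\varphi)| \leq \sum_{j=1}^k A_j B_j \pi_m(\varphi), \quad \text{all } \varphi \in C_c^\infty(K).$$

This proves our theorem. 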


XI, §3. DERIVATION OF DISTRIBUTIONS 


Theorem 3.1. Let $D$ be a differential operator on the open set $U$ of $\mathbf{R}^n$. 
Then there exists a unique differential operator $D^*$ such that for any 
functions $f \in C^\infty(U)$ and $\varphi \in C_c^\infty(U)$ we have 

$$\int (Df)\varphi\,d\mu = \int f\,(D^*\varphi)\,d\mu.$$

Proof. For the existence, we may restrict ourselves to the case when 
$D = \alpha D^p$ for some $\alpha \in C^\infty(U)$. Then we have 

$$D^* = (-1)^{|p|} D^p \circ M_\alpha,$$

because integration by parts shows that 

$$\int (Df)\varphi = \int \alpha (D^p f)\varphi = (-1)^{|p|} \int f\,D^p(\alpha\varphi).$$

This proves the existence. As for uniqueness, suppose that $D^*$ and $D'$ are 
differential operators such that 

$$\int (D'f)\varphi = \int (D^*f)\varphi$$

for all $f \in C_c^\infty(U)$ and $\varphi \in C_c^\infty(U)$. Then 

$$\int \big((D^* - D')f\big)\varphi = 0$$

for all $\varphi$, so that $(D^* - D')f = 0$, whence $D^* = D'$. 
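
As an illustration of the existence formula, the adjoint of a first-order operator can be expanded explicitly. This is a worked sketch for the special case $D = \alpha D_i$ (our choice), using only the formula $D_i \circ M_f = M_f \circ D_i + M_{D_i f}$ from §1.

```latex
% Special case: D = \alpha D_i with \alpha \in C^\infty(U), so |p| = 1.
\begin{align*}
D^*\varphi &= (-1)^{1}\, D_i(\alpha\varphi) \\
           &= -\alpha\, D_i\varphi - (D_i\alpha)\,\varphi,
\end{align*}
% that is,  D^* = -(M_\alpha \circ D_i + M_{D_i\alpha}).
% When \alpha is constant, the second term vanishes and D^* = -\alpha D_i.
```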


We shall call $D^*$ the adjoint of $D$. The map $D \mapsto D^*$ is an 
anti-automorphism of the ring of differential operators; anti because 

$$(D_1 D_2)^* = D_2^* D_1^*.$$

Let $T$ be a distribution on $U$ and $D$ a differential operator. We define 

$$DT = T \circ D^* = TD^*$$

on $C_c^\infty(U)$. In particular, if $D = D_i$ is the i-th partial derivative, then 
$D_i^* = -D_i$ and 

$$(D_i T)(\varphi) = -T\left(\frac{\partial \varphi}{\partial x_i}\right).$$

The reason for our definition is that if $f$ is a $C^\infty$ function, then 
recalling that $T_f(\varphi) = \int f\varphi\,d\mu$, we have the formula 

$$DT_f = T_{Df},$$

as one sees from Theorem 3.1. 


Example. Let $f$ be the locally integrable function on $\mathbf{R}$, such that 
$f(x) = 1$ if $x \geq 0$ and $f(x) = 0$ if $x < 0$ (this is sometimes called the 
Heaviside function). A trivial integration in one variable shows that the 
derivative of $T_f$ is simply the Dirac distribution, i.e. we have 

$$DT_f = \delta,$$

where $D = D_1$ is the derivative in one variable. 
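
The integration in question can be spelled out as follows. This is a short sketch using only the definition $DT = T \circ D^*$ with $D^* = -D$ in one variable, and the compact support of $\varphi$.

```latex
% f = Heaviside function, \varphi \in C_c^\infty(\mathbf{R}).
\begin{align*}
(DT_f)(\varphi) &= T_f(D^*\varphi) = -\int_{-\infty}^{\infty} f(x)\,\varphi'(x)\,dx \\
                &= -\int_0^{\infty} \varphi'(x)\,dx \\
                &= -\bigl[\varphi(x)\bigr]_0^{\infty} = \varphi(0) = \delta(\varphi),
\end{align*}
% since \varphi vanishes for large x.
```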


Example. Consider the Laplace operator 

$$\Delta = D_1^2 + D_2^2$$

on the plane $\mathbf{R}^2$. Let 

$$g(x, y) = \frac{1}{2\pi} \log r$$

where $r = (x^2 + y^2)^{1/2}$ as usual. Then $T_g$ is a distribution on the plane, 
and 

$$\Delta T_g = \delta$$

is the Dirac distribution at the origin. See formula L3 of Chapter XIX, 
§3. 


XI, §4. DISTRIBUTIONS WITH DISCRETE SUPPORT 


To investigate the structure of distributions with discrete support, it 
suffices to describe distributions whose support is one point, and then by 
translation, distributions whose support is at the origin. We can then 
give a complete description of such distributions. 


Theorem 4.1. Let $T$ be a distribution whose support is $\{0\}$. Then there 
exist an integer $m \geq 0$ and constants $c_p$ such that 

$$T = \sum_{|p| \leq m} c_p D^p \delta.$$

In fact, $c_p = (-1)^{|p|}\,T(x^p)/p!$. 
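
Before the proof, it may help to test the coefficient formula on a case where the answer is known in advance. The sketch below takes $T = D\delta$ in one variable (our choice of example) and uses the extension of $T$ to $C^\infty$ functions described after Corollary 2.4.

```latex
% One variable, T = D\delta, so T(\varphi) = \delta(D^*\varphi) = -\varphi'(0), and m = 1.
\begin{align*}
T(x^0) &= T(1) = -0 = 0,   & c_0 &= (-1)^0\, T(1)/0! = 0, \\
T(x^1) &= -\tfrac{d}{dx}(x)\big|_{x=0} = -1, & c_1 &= (-1)^1\, T(x)/1! = 1.
\end{align*}
% Hence \sum_{p \le 1} c_p D^p\delta = 0\cdot\delta + 1\cdot D\delta = D\delta = T,
% in agreement with the theorem.
```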


Proof. First we recall from differential calculus that if $U$ is an open 
set containing 0, and if $f \in C^\infty(U)$ and $D^p f(0) = 0$ for $|p| < k$, with $k \geq 1$, 
then there exist $C^\infty$ functions $f_p$ such that 

$$f(x) = \sum_{|p| = k} x^p f_p(x).$$

This is proved by starting with the formula 

$$f(x) = f(x) - f(0) = \int_0^1 f'(tx)x\,dt = \int_0^1 f'(tx)\,dt \cdot x,$$

and continuing to integrate similarly the successive derivatives of $f$. We 
then write the Taylor expansion of $f$, namely 

$$f(x) \sim \sum_p \frac{(D^p f)(0)}{p!}\,x^p.$$


Since $T$ has compact support, it has order $\leq m$ for some $m$. We shall 
use the definition of $T$ on functions as in the discussion following 
Corollary 2.4. 

We consider the Taylor expansion of $f$ up to the terms of order $m$, 
and consider the function 

$$(1) \qquad g(x) = f(x) - \sum_{|p| \leq m} \frac{(D^p f)(0)}{p!}\,x^p.$$

Then $(D^q g)(0) = 0$ if $|q| \leq m$, and our preceding remark allows us to write 
$g$ as a sum of terms each of which is of type 

$$x^k h_k(x)$$

with $|k| \geq m + 1$. We shall prove below that $T(x^k h) = 0$ if $h$ is $C^\infty$ in 
some neighborhood of 0, $T$ has order $\leq m$, and $|k| \geq m + 1$. Once we 
have this, we conclude that $Tg = 0$, whence from (1), we obtain 

$$Tf = \sum_{|p| \leq m} \frac{(D^p f)(0)}{p!}\,T(x^p),$$

from which our theorem follows at once. 

We now prove that $T(x^k h) = 0$ under the stated conditions. Let $\alpha$ be 
a $C^\infty$ function with support in the unit disc, and equal to 1 on some 
neighborhood of 0. Let $\alpha_r(x) = \alpha(rx)$, so that the support of $\alpha_r$ shrinks to 
the origin as $r \to \infty$, and in fact lies in the disc of radius $1/r$. Fix $q$ such 
that $|q| \geq m + 1$. We have for $r > 0$: 

$$T(x^q \alpha_r h) = T(x^q h),$$

and it will suffice to prove that this value tends to 0 as $r \to \infty$. Since $m$ 
is the order of $T$, there exists a constant $A$ such that 

$$|T(x^q \alpha_r h)| \leq A\,\pi_m(x^q \alpha_r h).$$

Thus we have to estimate 

$$\|D^p(x^q \alpha_r h)\|, \quad |p| \leq m.$$

The support of $x^q \alpha_r h$ lies in the disc of radius $1/r$. The usual formula for 
the derivative of a product yields 

$$D^p(x^q \alpha_r h) = \sum c_{jkl}\,x^{q-j}\,D^k\alpha_r\,D^l h$$

with $j + k + l = p$. (Addition is componentwise, and the coefficients $c_{jkl}$ 
are variations of binomial coefficients, determined universally by $p$ and $q$.) 
The derivatives $D^l h$ are uniformly bounded on a given neighborhood of 
the origin. We have 

$$(D^k\alpha_r)(x) = r^{|k|}(D^k\alpha)(rx),$$

and hence $D^k\alpha_r$ is bounded by $r^{|k|}$ times a bound for the derivatives of $\alpha$ 
itself, up to order $m$. In the circle of radius $1/r$ we have 

$$|x^{q-j}| \leq 1/r^{|q-j|} = 1/r^{|q| - |j|}.$$

But $|k| \leq m - |j| < |q| - |j|$. This proves that $D^p(x^q \alpha_r h)$ tends to 0 as 
$r \to \infty$, and concludes the proof of our theorem. 
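
The power count in the last step can be made explicit. In the sketch below the constants $C_1$, $C_2$ are names we introduce for the bounds on the derivatives of $\alpha$ and $h$; we also assume $r \geq 1$, which is harmless since $r \to \infty$.

```latex
% One term of D^p(x^q \alpha_r h) on the disc |x| \le 1/r, with r \ge 1.
% Assumed bounds (our notation): |D^k\alpha| \le C_1 and |D^l h| \le C_2.
\begin{align*}
|x^{q-j}\, D^k\alpha_r(x)\, D^l h(x)|
  &\leq r^{-(|q|-|j|)} \cdot r^{|k|} C_1 \cdot C_2 \\
  &= C_1 C_2\, r^{|j|+|k|-|q|} \\
  &\leq C_1 C_2\, r^{m-(m+1)} = C_1 C_2\, r^{-1},
\end{align*}
% using |j| + |k| \le |p| \le m and |q| \ge m + 1.
% Summing the finitely many terms gives \pi_m(x^q \alpha_r h) = O(1/r).
```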


More generally, by a similar technique, one can prove that a distribution 
with compact support is equal to $DT_f$, where $f$ is a continuous 
function, $D$ is some differential operator, and $T_f$ is the distribution given 
by $T_f(\varphi) = \int f\varphi\,d\mu$. Our intent here is not to give an exhaustive theory, 
but merely to give the reader a brief acquaintance and feeling for functionals 
depending on a more involved topology than that of a norm, 
taking into account partial derivatives. For a concise and very useful 
summary of other facts, cf. the first two chapters of Hörmander [Hö], and 
also Palais [Pa 1]. 

We conclude with a remark which is technically useful. One can 
define distributions on a torus ($\mathbf{R}^n$ modulo $\mathbf{Z}^n$) using $C^\infty$ functions on $\mathbf{R}^n$ 
which are periodic. The advantage of doing this lies in the fact that in 
this case, every distribution has compact support, and estimates become 
easier to make. One can use certain open subsets of the torus, as local 
domains replacing open sets in euclidean space, and thus one avoids 
certain quasipathological types of distributions arising from open subsets 
in $\mathbf{R}^n$, due to exceptional growth along the boundary. In many cases, it 
is worth paying the price of periodicity to achieve this. For an exposition 
along these lines, cf. [Pa 1]. 


CHAPTER XII 


Integration on Locally 
Compact Groups 


This chapter is independent of the others, but is interesting for its own 
sake. It gives examples of integration in a different setting from eu- 
clidean space, for instance integration on a group of matrices. For an 
application of integration and some functional analysis to compact 
groups, see Exercise 11. 


XII, §1. TOPOLOGICAL GROUPS 


A topological group $G$ is a topological space $G$ together with a group 
law, i.e. maps 

$$G \times G \to G \quad \text{and} \quad G \to G$$

which define a group law and the inverse mapping in the group, such 
that these maps are continuous. After this section, i.e. from §2 to the end 
of the chapter, we assume always in addition that $G$ is Hausdorff. 


Examples. (1) Euclidean space $\mathbf{R}^p$ is a topological group under 
addition. 

(2) The multiplicative group $\mathbf{R}^*$ of non-zero real numbers under 
multiplication. Similarly, the multiplicative group of non-zero complex 
numbers $\mathbf{C}^*$ under multiplication. 

(3) The group of non-singular $n \times n$ matrices $\mathrm{Mat}_n(\mathbf{R})$ or $\mathrm{Mat}_n(\mathbf{C})$ 
under multiplication. 

(4) The group $SL_n(\mathbf{R})$ or $SL_n(\mathbf{C})$ of matrices having determinant 1, 
under multiplication. 

(5) The Galois group of the algebraic numbers over the rational numbers, 
with the Krull topology. 

(6) The additive group of p-adic numbers. 
(If you don’t know these last two examples, don’t panic; forget about 
them. They won’t be used in this book.) 


In the non-commutative case, we write the group multiplicatively as 
usual. In the commutative case, we write it either multiplicatively or 
additively, depending on situations. 

If $G$ is a topological group, and $a \in G$, then we get a map 

$$\tau_a\colon G \to G$$

called translation by $a$, and defined by $\tau_a x = ax$. More accurately we call 
this left translation by $a$. Multiplication being continuous, it is clear that 
$\tau_a$ is continuous, and is in fact a homeomorphism since it has an inverse, 
namely translation by $a^{-1}$. 

The map $x \mapsto x^{-1}$ is also a homeomorphism of $G$ onto itself. 

Let $e$ be the unit element of $G$. Let $U$ be an open set containing $e$. If 
$a \in G$, then $aU$ is an open set containing $a$. If $V$ is an open set containing 
$a$, then $a^{-1}V$ is an open set containing $e$. Thus neighborhoods of $e$ 
and neighborhoods of any point in $G$ differ only by translation. 

The technique of $(\epsilon, \delta)$ in metric spaces can be used in topological 
groups by using translations, to give a uniform way of describing closeness. 
For instance, if $a, b \in G$ and $U$ is an open neighborhood of $e$, we 
can say that $a$, $b$ are $U$-close if $a \in bU$. This relation is symmetric if we 
can select $U$ to be symmetric, i.e. $U = U^{-1}$ (where for any set $S$ in $G$, the 
set $S^{-1}$ is the set of all elements $x^{-1}$ with $x \in S$). This can always be 
done: if $V$ is an open neighborhood of $e$ in $G$, then $V \cap V^{-1}$ is a 
symmetric open neighborhood. 
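
The symmetry claim is a one-line computation; the following sketch records it, assuming only that $U = U^{-1}$.

```latex
% Assume U = U^{-1}. Then for a, b \in G:
\begin{align*}
a \in bU
  &\iff b^{-1}a \in U \\
  &\iff (b^{-1}a)^{-1} = a^{-1}b \in U^{-1} = U \\
  &\iff b \in aU.
\end{align*}
% So "a is U-close to b" and "b is U-close to a" are the same condition.
```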

The $\epsilon/2$ technique can also be used: given an open neighborhood $U$ of 
$e$, there exists an open neighborhood $V$ of $e$ such that $VV = V^2 \subset U$. 
Indeed, the map $G \times G \to G$ being continuous, the inverse image of $U$ is 
open in $G \times G$ and contains an open set $W \times W'$ containing $(e, e)$. We 
let $V = W \cap W'$. Similarly, we can find an open neighborhood $V$ of $e$ 
such that $VV^{-1} \subset U$. 

We have the usual uniformity statements for continuous maps on 
compact sets. Let $S$ be a subset of $G$ and $f\colon S \to E$ be a map into a 
normed vector space. We say that $f$ is (left) uniformly continuous on $S$ if 
given $\epsilon$, there exists a neighborhood $U$ of $e$ such that for all $x, y \in S$ with 
$y^{-1}x \in U$ we have 

$$|f(x) - f(y)| < \epsilon.$$


Proposition 1.1. If $S$ is compact and $f\colon S \to E$ is continuous, then $f$ is 
uniformly continuous. 

Proof. Just the same as when $S$ is in a metric space. For each $x \in S$ 
we can find an open neighborhood $U_x$ of $e$ such that if $y \in xU_x$, then 

$$|f(y) - f(x)| < \epsilon.$$

Let $V_x$ be an open neighborhood of $e$ such that $V_x^2 \subset U_x$. There exists a 
finite covering $\{x_1 V_{x_1}, \ldots, x_n V_{x_n}\}$ of $S$ with $x_1, \ldots, x_n \in S$. Let 

$$V = V_{x_1} \cap \cdots \cap V_{x_n}.$$

Let $x, y \in S$ and suppose that $x \in yV$. We have $y \in x_i V_{x_i}$ for some $i$, and 
hence 

$$x \in x_i V_{x_i} V \subset x_i U_{x_i},$$

so that 

$$|f(x) - f(y)| \leq |f(x) - f(x_i)| + |f(x_i) - f(y)| < 2\epsilon,$$

thus proving our assertion. 


Proposition 1.2. If A, B are compact sets, then the product AB is 
compact. 


Proof. AB is the image of A x B under the continuous composition 
law of G. 


Proposition 1.3. Let $A$ be a subset of $G$. Then the closure of $A$ 
satisfies 

$$\overline{A} = \bigcap_V AV,$$

the intersection being taken over all open neighborhoods $V$ of $e$. 


Proof. Let $x \in \overline{A}$. For any open neighborhood $V$ of $e$ the open set 
$xV^{-1}$ contains $x$ and hence intersects $A$, i.e. there is some $y \in A$ such that 
$y = xv^{-1}$ with some $v \in V$. Then $x = yv$ and $x \in AV$, so we get one 
inclusion. Conversely, if $x$ is in all $AV$, then $xV^{-1}$ intersects $A$ for all $V$, 
whence $x$ lies in the closure of $A$. This proves our assertion. 


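Proposition 1.4. Let $H$ be a subgroup of $G$. The closure $\overline{H}$ is a 
subgroup. 


Proof. This is proved purely formally using the fact that translations 
and the inverse map are homeomorphisms. Namely, if $h \in H$, then 

$$hH = H \subset \overline{H}$$

and $H \subset h^{-1}\overline{H}$. Since $h^{-1}\overline{H}$ is closed it follows that $\overline{H} \subset h^{-1}\overline{H}$, whence 
$h\overline{H} \subset \overline{H}$. Hence $H\overline{H} \subset \overline{H}$. If $x \in \overline{H}$, then $Hx \subset \overline{H}$ and $H \subset \overline{H}x^{-1}$, which 
is closed. Hence 

$$\overline{H} \subset \overline{H}x^{-1}$$

and therefore $\overline{H}x \subset \overline{H}$ so that $\overline{H}\,\overline{H} \subset \overline{H}$. Similarly, $H^{-1} = H \subset \overline{H}$, so that 
$H \subset \overline{H}^{-1}$, and since $\overline{H}^{-1}$ is closed, we get $\overline{H} \subset \overline{H}^{-1}$, whence $\overline{H}^{-1} \subset \overline{H}$. 
Thus $\overline{H}$ is a subgroup. 


Example. If we take $H = \{e\}$, then $\overline{H}$ is the smallest closed subgroup 
of $G$. By what we saw concerning the closure of a set, it is equal to the 
intersection of all open neighborhoods of $e$. 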


We now consider coset spaces and factor groups. Let $H$ be a subgroup 
of $G$. We have the set of left cosets $\{xH\}$, $x \in G$, which we denote 
by $G/H$, and a natural map 

$$\pi\colon G \to G/H,$$

which to each $x \in G$ associates the left coset $xH$. We give $G/H$ the 
topology having the maximum amount of open sets making $\pi$ continuous. 
Thus a subset $W$ in $G/H$ is defined to be open if and only if $\pi^{-1}(W)$ is 
open in $G$. We have the following characterization of open sets in $G/H$: 


Proposition 1.5. A subset of $G/H$ is open if and only if it is of the form 
$\pi(V)$ for some open set $V$ in $G$. 


Proof. If $W$ is open in $G/H$, then $W = \pi(\pi^{-1}(W))$ and $\pi^{-1}(W)$ is open. 
Conversely, if $V$ is open in $G$, then 

$$\pi^{-1}(\pi(V)) = VH$$

is open, so $\pi(V)$ is open. 


In particular, we see that the map $\pi\colon G \to G/H$ is an open mapping, i.e. 
maps open sets onto open sets. All open subsets of $G/H$ may be written 
in the form $VH/H$ with $V$ open in $G$. In particular, $G/H$ is locally 
compact. 


Proposition 1.6. If $K'$ is a compact subset of $G/H$, then there exists a 
compact subset $K$ of $G$ such that $K' = \pi(K)$. 


Proof. Let $A$ be a compact neighborhood of $e$ in $G$. Then $\pi(A)$ is a 
compact neighborhood of the unit coset in $G/H$ (it is compact and we 
know that $\pi$ is an open mapping). Let $x_1, \ldots, x_n$ be elements of $G$ such 
that the sets $\pi(x_i A)$ cover $K'$. Let 

$$K = \pi^{-1}(K') \cap (x_1 A \cup \cdots \cup x_n A).$$

Then $K$ is compact and satisfies our requirements. 


The preceding property will be useful when we consider continuous 
functions on G/H. 


If we let $H = \overline{\{e\}}$ be the closure of the identity subgroup (interesting, if 
at all, only when the set consisting of $e$ alone is not closed), then $H$ is 
closed and it is easily seen that every point in $G/H$ is closed. It then 
follows that $G/H$ is Hausdorff, because of a general property of topological 
groups, namely: 


If each point of a topological group $G$ is a closed set, then $G$ is 
Hausdorff. 

Proof. Let $x, y \in G$ and $x \neq y$. Let $U$ be the complement of $xy^{-1}$ and 
let $V$ be a symmetric open neighborhood of $e$ such that $VV \subset U$. Then 
$V$ does not intersect $Vxy^{-1}$, and hence $Vx$, $Vy$ are disjoint open sets 
containing $x$ and $y$ respectively. 
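
The disjointness assertions in this proof follow from a short computation, sketched here. The elements $v$, $v'$ are names we introduce for arbitrary elements of $V$.

```latex
% U = G \setminus \{xy^{-1}\} is open (points are closed) and contains e, since x \ne y.
% Suppose V meets Vxy^{-1}:  v = v' x y^{-1}  with  v, v' \in V.  Then
\begin{align*}
xy^{-1} = (v')^{-1} v \in V^{-1}V = VV \subset U,
\end{align*}
% contradicting xy^{-1} \notin U.  Hence V \cap Vxy^{-1} = \emptyset.
% If Vx and Vy had a common point  vx = v'y,  then  v' = v x y^{-1} \in V \cap Vxy^{-1},
% which is impossible; so Vx and Vy are disjoint.
```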


As an exercise, the reader can show that if $H$ is a subgroup of $G$, then 
$G/H$ is Hausdorff if and only if $H$ is closed. From now on, we deal only 
with Hausdorff groups, and take coset spaces or factor groups (with 
normal $H$) only when $H$ is closed. 


Example. A subgroup $H$ of $G$ is said to be discrete if the induced 
topology on $H$ is the discrete topology, i.e. every point is open. Let 
$G = \mathbf{R}^p$ and let $v_1, \ldots, v_m$ ($m \leq p$) be vectors linearly independent over the 
reals. Let $\Gamma$ be the additive group of all linear combinations 

$$k_1 v_1 + \cdots + k_m v_m$$

with integers $k_i$. Then $\Gamma$ is a subgroup, which is immediately verified to 
be discrete. 

The group $\mathbf{Z}$ of integers is a discrete subgroup of $\mathbf{R}$, and the factor 
group $\mathbf{R}/\mathbf{Z}$ is isomorphic to the circle group (group of complex numbers 
of absolute value 1, under multiplication), under the map 

$$t \mapsto e^{2\pi i t}.$$

The group $GL(n, \mathbf{Z})$ of non-singular $n \times n$ matrices with integer 
components is a discrete subgroup of $GL(n, \mathbf{R})$. 


XII, §2. THE HAAR INTEGRAL, UNIQUENESS 


By a (left) Haar measure on a locally compact group $G$ (assumed Hausdorff 
from now on) we mean a positive measure $\mu$ on the Borel sets 
which is $\sigma$-regular, non-zero on any non-empty open set (or equivalently 
on any Borel set containing a non-empty open set), and which is left 
invariant, meaning that 

$$\mu(xA) = \mu(A)$$

for all measurable sets $A$, and all $x \in G$. 

We shall get hold of a Haar measure by going through positive functionals. 
Thus by a Haar functional we shall mean a positive non-zero 
functional $\lambda$ on $C_c(G)$ which is left invariant, i.e. 

$$\lambda(\tau_a f) = \lambda(f)$$

for all $f \in C_c(G)$. Here as usual, $\tau_a f = f_a$ is the $a$-translate of $f$, defined 
by 

$$f_a(x) = f(a^{-1}x).$$


Remark. Observe the presence of the inverse $a^{-1}$. This is deliberate. 
We want the formula 

$$\tau_{ab}(f) = \tau_a(\tau_b(f))$$

for $a, b \in G$, and this formula is true with the present definition. Functions 
form a contravariant system, i.e. if $T\colon X \to Y$ is a map of sets, then 
it induces a map in the reverse direction 

$$T^*\colon \text{functions on } Y \to \text{functions on } X.$$

We can apply this when $T = \tau_a$ is the translation, thus forcing the inverse 
$a^{-1}$ when applying translation to functions. 
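
The composition formula can be checked directly from the definition $(\tau_a f)(x) = f(a^{-1}x)$, as in the following sketch; the final comment shows what goes wrong without the inverse.

```latex
% With (\tau_a f)(x) = f(a^{-1}x):
\begin{align*}
\bigl(\tau_a(\tau_b f)\bigr)(x)
  &= (\tau_b f)(a^{-1}x) \\
  &= f(b^{-1}a^{-1}x) \\
  &= f\bigl((ab)^{-1}x\bigr) = (\tau_{ab} f)(x).
\end{align*}
% Without the inverse, i.e. with (\sigma_a f)(x) = f(ax), one gets instead
% \sigma_a(\sigma_b f)(x) = f(bax) = (\sigma_{ba} f)(x), which reverses the order.
```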


The original proof for the existence of Haar measure due to Haar 
provides the standard model for all known proofs. We shall prove the 
existence of the functional in §3, following Weil’s exposition [W]. Here, 
we discuss the relation between the measure and the functional; we prove 
uniqueness and give examples. 

First we prove a lemma which shows that a locally compact group 
has a certain $\sigma$-finiteness built into it. We recall that a set is called 
$\sigma$-compact if it is a countable union of compact sets. 


Lemma 2.1. Let $G$ be a locally compact group. Then there exists an 
open and closed subgroup $H$ which is $\sigma$-compact. 

Proof. Let $K$ be a symmetric compact neighborhood of $e$. Then the 
sets $K^n = KK \cdots K$ ($n$ times) are compact neighborhoods of $e$, and form 
an increasing sequence since $e \in K^n$ for all $n$. Let $H = K^\infty$ be the union 
of all $K^n$ for all positive integers $n$. Then $H$ is $\sigma$-compact. Furthermore, 
$H$ is a subgroup (obvious). Next, $H$ is open, because if $x \in H$, then 
$x \in K^n$ for some $n$, whence 

$$xK \subset K^{n+1} \subset H,$$

and $H$ is open. All cosets of $H$ are open, and we can write $G$ as a 
disjoint union of cosets of $H$. Then $H$ itself is the complement of an 
open set (union of all cosets $\neq H$), whence $H$ is closed. This proves our 
lemma. 
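
For orientation, here is the construction of the lemma carried out in the simplest case. The choice $G = \mathbf{R}$ (written additively) and $K = [-1, 1]$ is ours, made purely for illustration.

```latex
% G = \mathbf{R} under addition, e = 0, K = [-1, 1] (symmetric, compact, a neighborhood of 0).
% In additive notation K^n means K + K + ... + K (n summands):
\begin{align*}
K^n = [-n, n], \qquad
H = \bigcup_{n \geq 1} K^n = \bigcup_{n \geq 1} [-n, n] = \mathbf{R}.
\end{align*}
% Here H = G, so G itself is \sigma-compact and there is only one coset.
% For a group with the discrete topology one may take K = \{e\}, giving H = \{e\},
% and the cosets of H are then the individual points.
```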


In view of the lemma, we can write 

$$G = \bigcup_{i \in I} x_i H$$

for $i$ in some indexing set $I$, and $H$ is open, closed, and $\sigma$-compact. Let 
$\mu$ be a Haar measure. By the remarks following Theorem 2.3, Chapter 
IX, it follows that the measure on each coset $x_i H$ is regular. If $A$ is an 
arbitrary measurable set, then we can write 

$$A = \bigcup_{i \in I} A_i, \quad \text{where } A_i = A \cap x_i H,$$

and the $A_i$ are disjoint. If we have proved the uniqueness of Haar 
measure on $H$ (i.e. the fact that two Haar measures differ by a constant 
$> 0$), then the reader will verify easily that the uniqueness follows for $G$ 
itself. 


Theorem 2.2. If $\mu$ is a Haar measure, then for any $f \in \mathcal{L}^1(\mu)$ and any 
$a \in G$ we have 

$$\int_G f(ax)\,d\mu(x) = \int_G f(x)\,d\mu(x).$$

In particular, the functional $d\mu$ on $C_c(G)$ is left invariant, and therefore a 
Haar functional. 


Proof. If $\varphi$ is any step function then by linearity we see that 

$$\int_G \tau_a\varphi\,d\mu = \int_G \varphi\,d\mu.$$

Let $\{\varphi_n\}$ be a sequence of step functions converging both $L^1$ and almost 
everywhere to some $f$ in $\mathcal{L}^1(\mu)$. Then $\{\tau_a\varphi_n\}$ converges almost everywhere 
to $\tau_a f$. On the other hand, $\{\tau_a\varphi_n\}$ is immediately seen to be 
$L^1$-Cauchy because we have remarked that the integral of $\tau_a\varphi$ is the 
same as the integral of $\varphi$ for any step function $\varphi$. The first assertion of 
our theorem follows at once. (Note: This is the same proof we gave for 
the invariance of Lebesgue measure on euclidean space.) 


Theorem 2.3. Let $\mu$ and $\nu$ be Haar measures on $G$. Let $d\mu$ and $d\nu$ be 
the functionals on $C_c(G)$ associated with $\mu$ and $\nu$. Then there exists a 
number $c > 0$ such that $d\nu = c \cdot d\mu$. 


Proof. By a previous remark, we may assume that $G$ is $\sigma$-compact, 
and hence $\sigma$-finite with respect to our Haar measures. We shall apply 
Fubini’s theorem and refer the reader to Chapter IX, §6. 

We shall first give a simpler proof when the Haar measure is also 
right invariant (which applies for instance when $G$ is commutative). Let 
$h \in C_c(G)$ be a positive function such that 

$$\int_G h\,d\mu = 1.$$

We can find such $h$ by first selecting a non-empty open set $V$ with 
compact closure, a function $h_0$ such that $V \prec h_0$, and then multiplying $h_0$ by 
a suitable constant. (The symbol $\prec$ was defined in §2 of Chapter IX.) 
For an arbitrary $f \in C_c(G)$ we have: 

$$\begin{aligned}
\int_G f\,d\nu &= \int_G h\,d\mu \int_G f\,d\nu = \int_G \int_G h(y) f(x)\,d\nu(x)\,d\mu(y) \\
&= \int_G \int_G h(y) f(xy)\,d\nu(x)\,d\mu(y) \\
&= \int_G \int_G h(y) f(xy)\,d\mu(y)\,d\nu(x) \\
&= \int_G \int_G h(x^{-1}y) f(y)\,d\mu(y)\,d\nu(x) \\
&= \int_G \int_G h(x^{-1}y) f(y)\,d\nu(x)\,d\mu(y) \\
&= c \int_G f\,d\mu,
\end{aligned}$$

where 

$$c = c(\nu) = \int_G h(x^{-1})\,d\nu(x).$$

This proves our theorem in the present case. It also gives us an explicit 
determination of the constant $c$ involved in the statement of the theorem. 
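
The way the constant $c$ emerges from the inner integral $\int_G h(x^{-1}y)\,d\nu(x)$ can be spelled out in one line. The sketch below uses only the left invariance of $\nu$ (Theorem 2.2 applied to $\nu$), with $y$ held fixed.

```latex
% Fix y \in G.  Replace x by yx, which leaves \nu-integrals unchanged:
\begin{align*}
\int_G h(x^{-1}y)\,d\nu(x)
  &= \int_G h\bigl((yx)^{-1}y\bigr)\,d\nu(x) \\
  &= \int_G h(x^{-1}y^{-1}y)\,d\nu(x)
   = \int_G h(x^{-1})\,d\nu(x) = c.
\end{align*}
% So the inner integral is independent of y, and
%   \int_G \Bigl(\int_G h(x^{-1}y)\,d\nu(x)\Bigr) f(y)\,d\mu(y) = c \int_G f\,d\mu.
```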
The proof of uniqueness when the Haar measure is not also right
invariant is slightly more involved, and runs as follows. For each non-zero
positive function $f \in C_c(G)$, we consider the ratio of the integrals
(taken over $G$):

\[ r(f) = \frac{\int f \, d\mu}{\int f \, d\nu}. \]


It will suffice to show that this ratio is independent of $f$. We select a
positive function $h = h_K$ with support equal to a compact neighborhood
$K$ of the origin, such that $h$ is symmetric [i.e. $h(x) = h(x^{-1})$ for all $x$],
and also

\[ \int h \, d\nu = 1. \]

We can obviously satisfy these conditions with $K$ arbitrarily small, i.e.
contained in a given open neighborhood of the origin. (To get the
symmetry, use a product $h(x) = \psi(x)\psi(x^{-1})$ where $\psi$ has small support,
and to normalize the integral, multiply by a suitable positive constant.)
Now for any $f$ as above, we consider the difference

\[ \int h \, d\mu \int f \, d\nu - \int h \, d\nu \int f \, d\mu
   = \iint [h(x) f(y) - h(y) f(x)] \, d\mu(x) \, d\nu(y). \]


We change $x$ to $yx$ in the second term. We change $x$ to $y^{-1}x$ in the first
term, and then replace $y^{-1}x$ by $x^{-1}y$ using the symmetry of $h$ to get
$h(x^{-1}y)f(y)$ in the first term. We reverse the order of integration for the
first term, change $y$ to $xy$ and change the order of integration once more.
We then see that our difference is equal to

\[ \iint h(y)[f(xy) - f(yx)] \, d\mu(x) \, d\nu(y). \]


We now estimate this integral. If $K$ is small enough, then

\[ |f(xy) - f(yx)| < \varepsilon \]

for all $x \in G$ and all $y \in K$. Furthermore if $y \in K$, the function

\[ x \mapsto f(xy) - f(yx) \]

has support in the set $(\operatorname{supp} f)K^{-1} \cup K^{-1}(\operatorname{supp} f)$, which is compact, and
whose $\mu$-measure is bounded by a fixed number $C_f$, depending only on $f$,
as $K$ shrinks to the origin. Since $h$ is positive, we get the estimate (using
the fact that $\int h \, d\nu = 1$):

\[ \left| \int h \, d\mu \int f \, d\nu - \int h \, d\nu \int f \, d\mu \right|
   \leq \varepsilon C_f \int h \, d\nu = \varepsilon C_f. \]

Dividing by $\int f \, d\nu$, we obtain

\[ \lim_{K \to \{e\}} \int h \, d\mu = r(f), \]


with an obvious notation concerning the use of the limit symbol. The 
left-hand side is independent of f. This proves what we wanted, and 
concludes the proof of the uniqueness of Haar measure in general. 


Corollary 2.4. The map $\mu \mapsto d\mu$ is a bijection between the set of Haar
measures on $G$ and the set of Haar functionals. If $\mu$, $\nu$ are Haar
measures, then there exists $c > 0$ such that $\nu = c\mu$.

Proof. Let $\lambda$ be a Haar functional. Let $\mu$ be the measure associated
with $\lambda$ by Theorem 2.3 of Chapter IX. For any open set $V$ we have

\[ \mu(V) = \sup \lambda f \quad \text{for } f \prec V, \]

and $f \prec V$ if and only if $f_a \prec aV$, where $f_a(x) = f(a^{-1}x)$. Thus
$\mu(aV) = \mu(V)$. For any Borel set $A$ we have

\[ \mu(A) = \inf \mu(V) \quad \text{for $V$ open} \supset A, \]

and we conclude similarly that $\mu(aA) = \mu(A)$, so that $\mu$ is left invariant.

Let $V$ be a non-empty open set. Suppose that $\mu(V) = 0$. Any compact
set $K$ can be covered by a finite number of translates of $V$, and
consequently $\mu(K) = 0$ for all compact $K$. If $f \in C_c(G)$, $f \neq 0$, then
$f/\|f\| \prec W$ for some open $W$ with compact closure. It follows that
$0 \leq \lambda f \leq \mu(W)\|f\| = 0$, contradicting the non-triviality of $\lambda$. This proves
that $\mu(V) > 0$, and hence that $\mu$ is a Haar measure. Thus the map
$\mu \mapsto d\mu$ from Haar measures to Haar functionals is surjective. The map
is injective by the Corollary of Theorem 2.7, Chapter IX. The last statement
is now clear.


We are in a position to give examples of Haar measures and integrals.
As is often the case, for any concretely given group, one can exhibit a
specific Haar functional. The uniqueness theorem can then be used to
guarantee that it is the only one possible, up to a positive constant factor.

Examples. (1) Let $G = \mathbf{R}^n$ be the additive group of euclidean space.
Then Lebesgue measure is a Haar measure.

(2) Let $G = \mathbf{R}^*$ be the multiplicative group of non-zero real numbers.
On $C_c(\mathbf{R}^*)$ we define a functional

\[ f \mapsto \int_{-\infty}^{\infty} f(x) \, \frac{dx}{|x|}. \]

Thus we let $\mu^*$ be the measure such that $d\mu^*(x) = dx/|x|$. This is easily
seen to be invariant under multiplicative translations. Namely, suppose
that $a < 0$. We compute

\[ \int_{-\infty}^{\infty} f(ax) \, \frac{dx}{|x|}. \]

We change variables, with $u = ax$, $du = a \, dx$. Then $|x| = |u|/|a|$, and

\[ \frac{dx}{|x|} = -\frac{du}{|u|}. \]

But the limits of integration $\infty$ and $-\infty$ get reversed, and we conclude
at once that our integral is equal to

\[ \int_{-\infty}^{\infty} f(u) \, \frac{du}{|u|}, \]

as desired. The case when $a > 0$ is even easier.
(3) Let $T$ be the circle group, i.e. the group of complex numbers of
absolute value 1. The map $\lambda$ given by

\[ \lambda f = \int_0^1 f(e^{2\pi i t}) \, dt \]

is immediately seen to be a positive functional on $C_c(T)$, and also left
invariant, so that it is a Haar functional.

(4) Let G be a discrete group. The measure giving value 1 to a subset 
of G consisting of one element is a Haar measure. What is its cor- 
responding functional? 


As a matter of notation, one usually writes $dx$ instead of $d\mu(x)$ for a
Haar measure or its corresponding functional. For instance, in Example
(2), we would say that $dx/|x|$ is a Haar measure on $\mathbf{R}^*$ if $dx$ is a Haar
measure on $\mathbf{R}$.
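
The invariance computed in Example (2) can be checked numerically. The following Python sketch is only an illustration under stated assumptions: the test function `f`, the truncation `t_max` and the number of steps are hypothetical choices, and `f` is not compactly supported but decays so fast in $\log|x|$ that truncation is harmless. The substitution $x = \pm e^t$ turns $dx/|x|$ into $dt$ on each half-line.

```python
import math

def haar_integral_mult(f, t_max=20.0, steps=8000):
    """Approximate the integral of f over R* against dx/|x|.

    With x = +e^t and x = -e^t the measure dx/|x| becomes dt on each
    half-line, so we apply a trapezoid rule in t on [-t_max, t_max].
    """
    h = 2.0 * t_max / steps
    total = 0.0
    for k in range(steps + 1):
        t = -t_max + k * h
        weight = 0.5 if k in (0, steps) else 1.0
        x = math.exp(t)
        total += weight * (f(x) + f(-x))
    return total * h

def f(x):
    # Hypothetical test function: a Gaussian in log|x|, deliberately
    # different on the positive and negative half-lines.
    bump = math.exp(-math.log(abs(x)) ** 2)
    return 2.0 * bump if x > 0 else bump

a = -3.0  # a negative multiplicative translation, as in the text
translated = haar_integral_mult(lambda x: f(a * x))
original = haar_integral_mult(f)
```

For this particular `f` both values should agree with $3\sqrt{\pi}$ up to rounding, since each half-line contributes a Gaussian integral in $t$.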


Other examples will be given in the exercises.

As Weil pointed out in [W], most of classical Fourier analysis can be
done on locally compact commutative groups. For this and other appli- 
cations, besides [W], the reader can consult Loomis [Lo], Rudin [Ru 2] 
for commutative groups, and the collected works of Harish-Chandra for 
non-commutative groups. 

For purposes of this book, the proof of existence given in the next 
section can be omitted. 


XII, §3. EXISTENCE OF THE HAAR INTEGRAL
In this section we prove that a Haar functional exists. 


We let $L^+$ denote the set of all functions in $C_c(G, \mathbf{R}_{\geq 0})$. Then $L^+$
is closed under addition and multiplication by real numbers $\geq 0$. If
$f, g \in L^+$ and if we assume that $g$ is not identically 0, then there exist
numbers $c_i > 0$ and elements $s_i \in G$ ($i = 1, \ldots, n$) such that for all $x$ we
have

\[ f(x) \leq \sum_i c_i \, g(s_i x). \]

For instance, we let $V$ be an open set and $m > 0$ such that $g \geq m > 0$ on
$V$. We can take all $c_i = \sup f/m$ and cover the support $K$ of $f$ by
translates $s_1^{-1}V, \ldots, s_n^{-1}V$, so that $g(s_i x) \geq m$ for $x \in s_i^{-1}V$. We define

\[ (f : g) \]

to be the inf of all sums $\sum c_i$ for all choices of $\{c_i\}$, $\{s_i\}$ satisfying the
above inequality. If $g = 0$, we define $(f : g)$ to be $\infty$. The symbol $(f : g)$
satisfies the following properties. The first expresses an invariance under
translation, where $f_a(x) = f(a^{-1}x)$.

(1) $(f_a : g) = (f : g)$,

(2) $(f_1 + f_2 : g) \leq (f_1 : g) + (f_2 : g)$,

(3) $(cf : g) = c(f : g)$ if $c \geq 0$,

(4) If $f_1 \leq f_2$, then $(f_1 : g) \leq (f_2 : g)$,

(5) $(f : g) \leq (f : h)(h : g)$,

(6) $(f : g) \geq \sup f/\sup g$.


The first four properties are obvious. For (5), we note that if

\[ f(x) \leq \sum_i c_i \, h(s_i x) \quad\text{and}\quad h(x) \leq \sum_j d_j \, g(t_j x), \]

then

\[ f(x) \leq \sum_{i,j} c_i d_j \, g(t_j s_i x). \]

For property (6), let $x$ be such that $f(x) = \sup f$. Then

\[ \sup f \leq \sum_i c_i \, g(s_i x) \leq \Bigl(\sum_i c_i\Bigr) \sup g \]

whence (6) follows.

Let $h_0$ be a fixed non-zero function in $L^+$. We define

\[ \lambda_g(f) = \frac{(f : g)}{(h_0 : g)}. \]

Then we have

\[ \frac{1}{(h_0 : f)} \leq \lambda_g(f) \leq (f : h_0). \tag{7} \]


For each fixed $g$, the map $\lambda_g$ will give an approximation of the Haar
functional, which will be obtained below as a limit in a suitable sense.
We note that $\lambda_g$ is left invariant, and satisfies

\[ \lambda_g(f_1 + f_2) \leq \lambda_g(f_1) + \lambda_g(f_2), \qquad \lambda_g(cf) = c\lambda_g(f), \quad c \geq 0. \]

Furthermore, we shall now prove that if the support of $g$ is small, then
$\lambda_g$ is almost additive.

Lemma 3.1. Given $f_1, f_2 \in L^+$ and $\varepsilon$, there exists a neighborhood $V$ of
$e$ such that

\[ \lambda_g(f_1) + \lambda_g(f_2) \leq \lambda_g(f_1 + f_2) + \varepsilon \]

for all $g \in L^+$ having support in $V$.


Proof. Let $h$ be in $L^+$ and have the value 1 on the support of $f_1 + f_2$.
Let

\[ f = f_1 + f_2 + \delta h \]

with a number $\delta > 0$. Let

\[ h_1 = f_1/f \quad\text{and}\quad h_2 = f_2/f. \]

We use the usual convention that $h_1$ and $h_2$ are 0 wherever $f$ is equal to
0. Then $h_1$, $h_2$ are in $L^+$, and are therefore uniformly continuous. Let $V$
be such that

\[ |h_1(x) - h_1(y)| < \delta \quad\text{and}\quad |h_2(x) - h_2(y)| < \delta \]

whenever $y \in xV$. Let $g$ have support in $V$, and assume that $g \neq 0$. Let
$c_1, \ldots, c_n$ be positive numbers and $s_1, \ldots, s_n \in G$ be such that

\[ f(x) \leq \sum_i c_i \, g(s_i x) \]

for all $x$. If $g(s_i x) \neq 0$, then $s_i x \in V$. We obtain

\[ f_1(x) = f(x)h_1(x) \leq \sum_i c_i \, g(s_i x) h_1(x)
   \leq \sum_i c_i \, g(s_i x)\,[h_1(s_i^{-1}) + \delta], \]

and consequently

\[ (f_1 : g) \leq \sum_i c_i \,[h_1(s_i^{-1}) + \delta]. \]

We have a similar inequality for $(f_2 : g)$. Since $h_1 + h_2 \leq 1$, we obtain

\[ (f_1 : g) + (f_2 : g) \leq \sum_i c_i \,(1 + 2\delta). \]

Taking the inf over the families $\{c_i\}$, we find

\[ (f_1 : g) + (f_2 : g) \leq (f : g)(1 + 2\delta)
   \leq [(f_1 + f_2 : g) + \delta(h : g)](1 + 2\delta). \]

Divide by $(h_0 : g)$. We obtain

\[ \lambda_g(f_1) + \lambda_g(f_2) \leq [\lambda_g(f_1 + f_2) + \delta\lambda_g(h)](1 + 2\delta). \]

But by (7), we know that $\lambda_g(f_1 + f_2)$ and $\lambda_g(h)$ are bounded from above
by numbers depending only on $f_1$, $f_2$ (and $h$, which itself depends only
on $f_1$, $f_2$). Hence for small $\delta$ we conclude the proof of the lemma.


For each non-zero $f \in L^+$ we let $I_f$ be the closed interval

\[ I_f = \left[ \frac{1}{(h_0 : f)}, \; (f : h_0) \right]. \]

Let $I$ be the Cartesian product of all intervals $I_f$. Then $I$ is compact by
Tychonoff's theorem. Each $g \in L^+$, $g \neq 0$ gives rise to a map $\lambda_g$, which is
determined by its values $\lambda_g(f)$ for $f \in L^+$, $f \neq 0$. We may therefore view
each $\lambda_g$ as a point of $I$. For each open neighborhood $V$ of $e$ in $G$ let $S_V$
be the closure of the set of all $\lambda_g$ with $g$ having support in $V$. Then $S_V$ is
compact, and the collection of compact sets $S_V$ has the finite intersection
property because

\[ S_{V_1} \cap \cdots \cap S_{V_n} \supset S_{V_1 \cap \cdots \cap V_n}. \]

By compactness, there exists an element $\lambda$ in the intersection of all sets
$S_V$. We contend that $\lambda$ is additive. This is immediate, because given $\varepsilon$
and any neighborhood $V$ of $e$, and $f_1$, $f_2$, $f_3 = f_1 + f_2$ we can find
$g \in L^+$ having support in $V$ such that

\[ |\lambda f_i - \lambda_g f_i| < \varepsilon \quad \text{for } i = 1, 2, 3. \]

Using Lemma 3.1, we conclude that $\lambda$ is additive.
If $f \in L$, we can write $f = f_1 - f_2$ with $f_1, f_2 \in L^+$. We define

\[ \lambda f = \lambda f_1 - \lambda f_2. \]

The additivity of $\lambda$ shows that this is well defined, i.e. independent of the
choice of $f_1$, $f_2$, and it is immediately verified that $\lambda$ is then linear on $L$.
Furthermore, from the properties of $\lambda_g$, we also conclude that $\lambda$ is left
invariant, and that for any non-zero $f \in L^+$ we have

\[ \frac{1}{(h_0 : f)} \leq \lambda f \leq (f : h_0). \]

In particular, $\lambda$ is non-trivial. This concludes the proof of the existence
of the Haar functional.


XII, §4. MEASURES ON FACTOR GROUPS
AND HOMOGENEOUS SPACES


Let $H$ be a closed subgroup of $G$ and let $dy$ be a Haar measure on $H$.
Let $f \in C_c(G)$. Then the function

\[ u \mapsto \int_H f(uy) \, dy \]

is continuous on $G$, as one verifies at once from the uniform continuity
of $f$. Furthermore, it is constant on left cosets of $H$, because of the left
invariance of the Haar measure on $H$, and has compact support on $G/H$.
Thus if we write $u$ for elements of $G/H$, there exists a unique function
$f^H \in C_c(G/H)$ such that

\[ f^H(u) = \int_H f(uy) \, dy. \]

Theorem 4.1. Let $H$ be a closed subgroup of $G$. The map

\[ f \mapsto f^H \]

is a linear map of $C_c(G)$ onto $C_c(G/H)$. If $H$ is normal, and $du$ is a
Haar measure on $G/H$, then the functional

\[ f \mapsto \int_{G/H} \int_H f(uy) \, dy \, du \]

is a Haar functional on $G$.


Proof. That the repeated integral is a positive functional and is left
invariant is obvious. We must show that it is non-trivial. This will come
from the first statement, valid even when $H$ is not normal, and easily
proved as follows. Let $\pi\colon G \to G/H$ be the natural map. Let $f' \in C_c(G/H)$
and let $K'$ be the support of $f'$. We know from §1 that there exists a
compact $K$ in $G$ such that $\pi(K) = K'$. Let $g \in C_c(G)$ be a positive function
which is $> 0$ on $K$. Then $g^H$ will be $> 0$ on $K'$. Let

\[
h(x) =
\begin{cases}
\dfrac{f'(\pi(x))}{g^H(\pi(x))} & \text{if } g^H(\pi(x)) > 0, \\[2ex]
0 & \text{if } g^H(\pi(x)) = 0.
\end{cases}
\]

Then $h$ is continuous on $G$, and is constant on cosets of $H$. Let $f = gh$.
Then it is clear that $f^H = f'$, thus proving our first assertion, and the
theorem.


When $H$ is not normal, we can still say something, and we cast it in a
slightly more general context. Let $S$ be a locally compact Hausdorff
space, and $G$ our group. We say that $G$ operates on $S$ if we are given a
continuous map

\[ G \times S \to S \]

satisfying the conditions

\[ x(yu) = (xy)u \quad\text{and}\quad eu = u \]

for all $x, y \in G$, $u \in S$ and $e$ the unit element of $G$. Then for each $x \in G$
we have a homeomorphism $\tau_x\colon S \to S$ given by $\tau_x u = xu$.

The coset space of a closed subgroup $H$ is an example of the above,
because for any coset $yH$ we can define $\tau_x(yH) = xyH$. The operation is
obviously continuous. A homogeneous space is a space on which $G$
operates transitively. Such a space is then isomorphic to $G/H$ with some
closed subgroup $H$.

Let $\lambda$ be a positive functional on $C_c(S)$. We shall say that $\lambda$ is relatively
invariant (for the given operation of $G$) if for each $a \in G$ there exists
a number $\psi(a)$ such that for all $f \in C_c(S)$ we have

\[ \lambda(\tau_a f) = \psi(a)\lambda(f). \]

As before, we define $\tau_a f(s) = f(a^{-1}s)$ for $a \in G$, $s \in S$. It follows immediately
that $\psi\colon G \to \mathbf{R}^+$ is a continuous homomorphism into the multiplicative
group of positive reals, called the character of the functional. We
define a relatively invariant measure similarly.


As with Haar measures, we have 


Theorem 4.2. The map $\mu \mapsto d\mu$ is a bijection between the set of $\sigma$-regular
positive relatively invariant measures on $S$ and the set of positive
relatively invariant functionals on $C_c(S)$.


In the case of the coset space, we then have the analog of Theorem 
4.1. 


Theorem 4.3. Let $G$ be a locally compact group and $H$ a closed subgroup.
Let $du$ be a relatively invariant non-zero positive functional on
$C_c(G/H)$ with character $\psi$, and let $dy$ be a Haar measure on $H$. Then
the map

\[ f \mapsto \int_{G/H} \int_H f(uy)\,\psi(uy)^{-1} \, dy \, du \]

is a Haar functional on $G$.


Proof. Our map is obviously linear, positive, and non-trivial because
of the first statement of Theorem 4.1. There remains to show that it is
left invariant. But

\[ \int_{G/H} \int_H f(auy)\,\psi(uy)^{-1} \, dy \, du
   = \psi(a) \int_{G/H} \int_H f(auy)\,\psi(auy)^{-1} \, dy \, du. \]

The outer integral on the right is the integral of $(f\psi^{-1})^H$ translated by
$a^{-1}$ and hence by the relative invariance of $du$ and the definition of the
character, is the same as the integral of $(f\psi^{-1})^H$ times $\psi(a)^{-1}$. This yields
what we wanted.


Example 1. In Exercise 4 you will show that the multiplicative group
$\mathbf{C}^*$ of non-zero complex numbers is isomorphic to $\mathbf{R}^+ \times T$, where $T$ is the circle,
and that Haar measure on $\mathbf{C}^*$ is given by $r^{-1} \, dr \, d\theta$ (in terms of the usual
polar coordinates).




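Example 2. Let $G = GL(n, \mathbf{R})$ be the group of invertible real $n \times n$
matrices. In Exercise 5 you will show that Haar measure on $G$ is given
by $dx/|\det x|^n$ for $x \in G$, where $dx$ is Lebesgue measure on the space of
all $n \times n$ matrices. (The exponent $n$ arises because left multiplication by
$a$ acts on each of the $n$ columns of $x$, so its determinant is $(\det a)^n$.)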


Example 3. Let $G$ be the subgroup of $GL(2, \mathbf{R})$ consisting of all
matrices

\[ \begin{pmatrix} a & b \\ 0 & 1 \end{pmatrix}. \]

In Exercise 9 you will show that right Haar measure on $G$ is the product
measure on $\mathbf{R}^+ \times \mathbf{R}$, and is not equal to left Haar measure.


Example 4. SU(2). This is the special unitary group, defined here to
be the group of $2 \times 2$ complex matrices of the form

\[ \begin{pmatrix} \alpha & \beta \\ -\bar\beta & \bar\alpha \end{pmatrix}
   \quad\text{with } \alpha\bar\alpha + \beta\bar\beta = 1. \]

Write $\alpha = a_1 + ia_2$ and $\beta = b_1 + ib_2$ with $a_i$, $b_i$ real. Then there is a
bijection of $\mathbf{R}^4$ with the real vector space of matrices

\[ \begin{pmatrix} \alpha & \beta \\ -\bar\beta & \bar\alpha \end{pmatrix}, \]

and under this bijection, SU(2) corresponds to the sphere $S^3$ defined by
the equation

\[ a_1^2 + a_2^2 + b_1^2 + b_2^2 = 1. \]

The reader can prove at once that euclidean measure on $\mathbf{R}^4$ given by

\[ da_1 \, da_2 \, db_1 \, db_2 \]

is invariant by the action of SU(2) on $\mathbf{R}^4$.
Define

\[ r = (a_1^2 + a_2^2 + b_1^2 + b_2^2)^{1/2} \]

and

\[ u_1 = a_1/r, \qquad u_2 = a_2/r, \qquad \theta = \text{angle of } \beta. \]

Then we leave as an exercise to show that

\[ da_1 \, da_2 \, db_1 \, db_2 = r^3 \, dr \, du_1 \, du_2 \, d\theta, \]

and $(u_1, u_2, \theta)$ define a $C^\infty$ isomorphism of $S^3$ on SU(2). Finally, let
$f \in C_c(SU(2))$, and let $F$ be the corresponding function in $C_c(S^3)$, where
$F = F(u_1, u_2, \theta)$ is a function of the above coordinates on $S^3$. Then the
Haar measure on SU(2) can be expressed in terms of the coordinates
by the formula

\[ \int_{SU(2)} f \, d\mu = \frac{1}{2\pi^2}
   \int_{-1}^{1} \int_{-\sqrt{1-u_1^2}}^{\sqrt{1-u_1^2}} \int_0^{2\pi}
   F(u_1, u_2, \theta) \, d\theta \, du_2 \, du_1. \]

We leave the verification as Exercise 12. This normalization gives the
compact group SU(2) measure 1.
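
As a small sanity check on the normalization, one can compute the total mass of the coordinate domain, namely $(u_1, u_2)$ in the unit disc and $0 \leq \theta \leq 2\pi$, with respect to $d\theta \, du_2 \, du_1$. The Python sketch below uses a plain midpoint rule; the function name and the number of subdivisions are hypothetical choices made for illustration. For SU(2) to have measure 1, the constant in front of the triple integral has to be the reciprocal of this number, which is $2\pi^2$.

```python
import math

def coordinate_domain_mass(n=200_000):
    """Midpoint-rule value of the triple integral of 1 over
    -1 <= u1 <= 1, |u2| <= sqrt(1 - u1^2), 0 <= theta <= 2*pi."""
    h = 2.0 / n
    total = 0.0
    for k in range(n):
        u1 = -1.0 + (k + 0.5) * h
        # The two inner integrals are explicit: the length of the
        # u2-interval times the length of the theta-interval.
        total += 2.0 * math.sqrt(1.0 - u1 * u1) * 2.0 * math.pi
    return total * h

mass = coordinate_domain_mass()
```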


XII, §5. EXERCISES 


We let $G$ be a locally compact Hausdorff group and $\mu$ a Haar measure.

1. Identify $\mathbf{C}$ with $\mathbf{R}^2$. Let $\mu$ be Lebesgue (Haar) measure on $\mathbf{C}$. Let $\alpha \in \mathbf{C}$, and
let $z$ denote the general element of $\mathbf{C}$. Show that $d\mu(\alpha z) = |\alpha|^2 \, d\mu(z)$, as
functionals on $C_c(\mathbf{C}) = C_c(\mathbf{R}^2)$.


2. Let $T$ be the circle group and let $H$ be a discrete abelian group which is not
countable. Let $t$ be a fixed element of $T$ and let $\{x_i\}$ ($i \in I$) be a non-countable
subset of $H$. Let $S$ be the set of all pairs $(t, x_i)$, $i \in I$. Show that $S$
is discrete in $T \times H$, and that if $\mu$ is Haar measure on $T \times H$, then $S$ has
infinite measure. Show that all compact subsets of $S$ have measure 0. Thus
Haar measure is not regular.


3. Show that $G$ is compact if and only if $\mu(G)$ is finite. Show that $G$ is discrete
if and only if the set consisting of $e$ alone has measure $> 0$.


4. Let $\mathbf{C}^*$ be the multiplicative group of complex numbers $\neq 0$. Let $\mathbf{R}^+$ denote
the multiplicative group of real numbers $> 0$. Show that $\mathbf{C}^*$ is isomorphic to
$\mathbf{R}^+ \times T$ (where $T$ is the circle group) under the map $(r, u) \mapsto ru$. We write
$u = e^{2\pi i\theta}$. Show that the Haar integral on $\mathbf{C}^*$ is given by

\[ f \mapsto \int_0^1 \int_0^\infty f(re^{2\pi i\theta}) \, \frac{dr}{r} \, d\theta. \]


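5. Let $G = GL(n, \mathbf{R})$ be the group of invertible real $n \times n$ matrices. Show that Haar
measure on $G$ is given by $dx/|\det x|^n$, if $dx$ is a Haar measure on the
$n^2$-dimensional space of all $n \times n$ matrices. (Use the change of variables
formula; left multiplication by $a$ on this space has determinant $(\det a)^n$.)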


6. Let $f \in C_c(G)$ and let $a \in G$. We denote by $r_a f$ or $f^a$ the right translate of $f$,
defined by

\[ f^a(x) = f(xa^{-1}). \]

For each fixed $a$, show that there exists a number $\Delta(a)$ such that for all
$f \in C_c(G)$ we have

\[ \int_G f(xa^{-1}) \, dx = \Delta(a) \int_G f(x) \, dx. \]

Show that $\Delta(ab) = \Delta(a)\Delta(b)$, and that $\Delta$ is continuous, as a map of $G$ into
$\mathbf{R}^+$. We call $\Delta$ the modular function on $G$.


7. If $\Delta$ is the modular function on $G$, show that

\[ \int_G f(x^{-1}) \Delta(x^{-1}) \, dx = \int_G f(x) \, dx, \]

where $dx$ is Haar measure, and that $\Delta(x^{-1}) \, dx$ is right Haar measure.


8. If $G$ is compact and $\psi\colon G \to \mathbf{R}^+$ is a continuous homomorphism into $\mathbf{R}^+$,
show that $\psi$ is trivial, i.e. $\psi(G) = 1$. In this case, Haar measure is also right
invariant.


9. Compute the modular function for the group $G$ of all affine maps $x \mapsto ax + b$
with $a \in \mathbf{R}^+$ and $b \in \mathbf{R}$. In fact, show that $\Delta(a, b) = a$. In this case the right
Haar measure is not equal to the left Haar measure. Show that the right
Haar measure is the Cartesian product measure on $\mathbf{R}^+ \times \mathbf{R}$.


10. Let $G$ be a locally compact group with Haar measure, let $\Delta$ be the modular
function as in Exercise 7. Let $M^1$ be the set of regular complex measures on
$G$.

(a) Just as in Exercise 8 of Chapter IX, prove that $M^1$ is a Banach algebra, if
we define the convolution $m * m'$ to be the measure associated with the
functional

\[ f \mapsto \iint f(xy) \, dm(x) \, dm'(y), \]

for $f \in C_c(G)$.

(b) For $m \in M^1$, define $m^\vee$ to be the direct image of $m$ under the mapping
$x \mapsto x^{-1}$ of $G$. Show that $\|m^\vee\| = \|m\|$.

(c) If $\mu$ is a right Haar measure, prove that $\mu^\vee = \Delta\mu$.


11. Let $G$ be a compact abelian group. By a character of $G$ we mean a continuous
homomorphism $\psi\colon G \to \mathbf{C}^*$ into the multiplicative group of non-zero
complex numbers.

(a) Show that the values of $\psi$ lie on the unit circle.

(b) If $G = \mathbf{R}^n/\mathbf{Z}^n$ is an $n$-torus, show that the characters separate points.
Assume this for the general case.

(c) Let $\sigma\colon G \to G$ be a topological and algebraic automorphism of $G$, or an
automorphism for short. Show that $\sigma$ preserves Haar measure, and induces
a norm-preserving linear map $T\colon L^2(\mu) \to L^2(\mu)$ by $f \mapsto f \circ \sigma$.

(d) If $\psi$ is a non-trivial character on $G$, show that $\int \psi \, d\mu = 0$. If $\psi$ is trivial,
that is $\psi(G) = 1$, then $\int \psi \, d\mu = 1$, assuming that $\mu(G) = 1$, which we do.
Prove that the characters generate an algebra which is dense for the sup
norm in the algebra of continuous functions. Prove that the characters
form a Hilbert basis for $L^2$.

(e) Let $f$ be an eigenvector for $T$ in $L^2$, with eigenvalue $\alpha$, that is $Tf = \alpha f$.
Show that $\alpha$ is a root of unity. [Hint: First, $|\alpha| = 1$. Then write the
Fourier expansion $f = \sum c_\psi \psi$ in $L^2$, and observe that $c_{T\psi} = \alpha^{-1} c_\psi$. Use
the fact that $\sum |c_\psi|^2 < \infty$ whence if $c_\psi \neq 0$, then for some $n$, $T^n\psi = \psi$ and
$\alpha^n = 1$.]


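12. Prove the statements made in Example 4, concerning Haar measure on SU(2).
You may use the following trick. Let $h \in C_c(\mathbf{R}^+)$ be such that

\[ \int_0^\infty h(r) r^3 \, dr = 1. \]

Show that the functional

\[
\begin{aligned}
\lambda(F) &= \int_{\mathbf{R}^4} F(a_1/r, a_2/r, \theta) h(r) \, da_1 \, da_2 \, db_1 \, db_2 \\
  &= \int_{-1}^{1} \int_{-\sqrt{1-u_1^2}}^{\sqrt{1-u_1^2}} \int_0^{2\pi}
     F(u_1, u_2, \theta) \, d\theta \, du_2 \, du_1
\end{aligned}
\]

defines a functional on $C(S^3)$ which is invariant under the action of SU(2).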


PART FOUR 


Calculus 


The differential calculus is an essential tool, and some of it will be used 
in some of the later applications. Readers can bypass it until they come 
to a place where it is used. The differential calculus in Banach spaces 
was developed quite a while ago, in the 1920s, by Fréchet, Graves, 
Hildebrandt, and Michael. Its recent return to fashion, after a period 
during which it was somewhat forgotten, is due to increasingly fruitful 
applications to function spaces in various contexts of analysis and 
geometry. 

For instance, at the end of Chapter XIV, in the exercises, we show 
how the calculus in Banach spaces can be used by describing the start of 
the calculus of variations. 

In functional analysis, one considers at the very least functions from 
scalars into Banach or Hilbert spaces, giving rise to curves (real or 
complex) in such spaces. 

We shall also use the elementary integral of Banach-valued continuous 
functions in the spectral theory of Chapter XVI, §1, in connection with 
Banach-valued analytic functions. 

The Morse-Palais lemma of Chapter XVIII gives a nice illustration of
the second derivative of a map in Hilbert space.

In the part on global analysis, readers will see the calculus used 
in dealing with finite dimensional manifolds, especially for integration 
theory, as in Stokes’ theorem. For applications to infinite dimensional 
manifolds, see [La 2]. 

As a matter of exposition, we prefer to develop ab ovo the integral of
step maps from an interval into a Banach space, and its extension to the
uniform closure of the space of step maps, which can be done much
more easily than the construction of the general $L^1$ integral.


CHAPTER XIII


Differential Calculus 


Throughout this chapter and the next, we let E, F, G denote 
Banach spaces. 


XIII, §1. INTEGRATION IN ONE VARIABLE


Let $[a, b]$ be a closed interval, and $E$ a Banach space. By a step map
$f\colon [a, b] \to E$ we mean a map for which there exists a partition

\[ P\colon a = a_0 \leq a_1 \leq \cdots \leq a_n = b \]

and elements $v_1, \ldots, v_n \in E$ such that if $a_{i-1} < t < a_i$, then $f(t) = v_i$. We
then say that $f$ is step with respect to $P$. The notion of a refinement of a
partition is the usual one, and if $f$, $g$ are two step maps of $[a, b]$ into $E$,
then there exists a partition $P$ such that both $f$, $g$ are step with respect
to $P$. From this we see that the step maps form a subspace of the space
of all bounded maps, and we deal with the sup norm on this space.

We define the integral of a step map $f$ with respect to a partition $P$ by

\[ I_P(f) = \sum_{i=1}^{n} (a_i - a_{i-1}) v_i, \]

the notation being as above. This is in fact independent of $P$, and we
write simply $I(f)$ or $I_a^b(f)$ to specify the interval $[a, b]$. It is then easily
seen that $I$ is linear, and that $|I(f)| \leq (b - a)\|f\|$, so $I$ is continuous,
with bound $b - a$. We can therefore extend $I$ to the closure of the space
of step maps by the linear extension theorem. If $f$ lies in this closure, we
denote $I(f)$ from now on by

\[ \int_a^b f \]

and call it the integral. If $a \leq c \leq b$, then one verifies without difficulty
that

\[ \int_a^b f = \int_a^c f + \int_c^b f. \tag{1} \]

If $a \leq c < d \leq b$, we define

\[ \int_d^c f = -\int_c^d f. \]

Then formula (1) actually holds for any three points $a$, $b$, $c$ in any order,
lying in an interval on which $f$ is in the closure of the space of step
maps.

Since a continuous map is uniformly continuous on a compact set, 
one concludes that the continuous maps of [a, b] into E lie in the closure 
of the space of step maps, so that the integral is defined over continuous 
maps. 

If $E = E_1 \times \cdots \times E_n$ is a product of Banach spaces, and

\[ f = (f_1, \ldots, f_n) \]

is represented by coordinate maps $f_i\colon [a, b] \to E_i$, then it is trivially
verified that

\[ \int_a^b f = \left( \int_a^b f_1, \ldots, \int_a^b f_n \right). \]

If $E = \mathbf{R}$, and $f \geq 0$, then

\[ \int_a^b f \geq 0, \]

as one sees first for step maps, and then by continuity for uniform limits
of step maps.

For convenience, the closure of the space of step maps will be called
the space of regulated maps. Thus a map is called regulated if it is a
uniform limit of step maps.

Proposition 1.1. Let $\lambda\colon E \to F$ be a continuous linear map. If $f\colon [a, b] \to E$
is regulated, then $\lambda \circ f$ is regulated and

\[ \int_a^b \lambda \circ f = \lambda\left( \int_a^b f \right). \]

This follows immediately from the definitions. Indeed, if $f$ is the uniform
limit of a sequence of step maps $\{f_n\}$, then each $\lambda \circ f_n$ is a step map
of $[a, b]$ into $F$, and the sequence $\{\lambda \circ f_n\}$ clearly converges to $\lambda \circ f$. For a step map $f$, we
have directly from the definition that

\[ \int_a^b \lambda \circ f = \lambda\left( \int_a^b f \right). \]

Taking the limit proves our formula.
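
The construction of this section can be sketched in a few lines of Python, taking $E = \mathbf{R}^2$ with vectors represented as tuples. This is only an illustration under stated assumptions: the helper names `step_integral` and `step_approximation`, the choice of midpoint values, and the sample curve are all hypothetical. The first function is the sum $\sum (a_i - a_{i-1}) v_i$; the second builds a step map uniformly close to a continuous map, so that its integral approximates $\int_a^b f$.

```python
import math

def step_integral(partition, values):
    """Integral of a step map: sum of (a_i - a_{i-1}) * v_i.

    partition = [a_0, ..., a_n], values = [v_1, ..., v_n] (tuples in R^d).
    """
    dim = len(values[0])
    total = [0.0] * dim
    for i, v in enumerate(values):
        length = partition[i + 1] - partition[i]
        for j in range(dim):
            total[j] += length * v[j]
    return tuple(total)

def step_approximation(f, a, b, n):
    """Step map on n equal subintervals taking the value of f at each midpoint."""
    partition = [a + (b - a) * i / n for i in range(n + 1)]
    values = [f(0.5 * (partition[i] + partition[i + 1])) for i in range(n)]
    return partition, values

def curve(t):
    # Sample continuous map [0, pi] -> R^2; its integral is (0, 2).
    return (math.cos(t), math.sin(t))

exact_step = step_integral([0.0, 1.0, 3.0], [(1.0, 0.0), (0.0, 2.0)])
partition, values = step_approximation(curve, 0.0, math.pi, 1000)
approx = step_integral(partition, values)
```

Here `exact_step` is the integral of a genuine step map, while `approx` illustrates the extension by continuity: refining the partition makes the step maps converge uniformly to `curve`, and their integrals converge to $(0, 2)$.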


XIII, §2. THE DERIVATIVE AS A LINEAR MAP


Let $U$ be open in $E$, and let $x \in U$. Let $f\colon U \to F$ be a map. We shall
say that $f$ is differentiable at $x$ if there exists a continuous linear map
$\lambda\colon E \to F$ and a map $\psi$ defined for all sufficiently small $h$ in $E$, with
values in $F$, such that

\[ \lim_{h \to 0} \psi(h) = 0, \]

and such that

\[ f(x + h) = f(x) + \lambda(h) + |h|\psi(h). \tag{$*$} \]

Setting $h = 0$ shows that we may assume that $\psi$ is defined at 0 and that
$\psi(0) = 0$. The preceding formula still holds.

Equivalently, we could replace the term $|h|\psi(h)$ by a term $\varphi(h)$ where
$\varphi$ is a map such that

\[ \lim_{h \to 0} \frac{\varphi(h)}{|h|} = 0. \]

The limit is taken of course for $h \neq 0$, otherwise the quotient does not
make sense.

A mapping having the preceding limiting property is said to be $o(h)$
for $h \to 0$.

We view the definition of the derivative as stating that near $x$, the
values of $f$ can be approximated by a linear map $\lambda$, except for the
additive term $f(x)$, of course, with an error term described by the limiting
properties of $\psi$ or $\varphi$ described above.

It is clear that if $f$ is differentiable at $x$, then it is continuous at $x$.

We contend that if the continuous linear map $\lambda$ exists satisfying ($*$),
then it is uniquely determined by $f$ and $x$. To prove this, let $\lambda_1$, $\lambda_2$ be
continuous linear maps having property ($*$). Let $v \in E$. Let $t$ have real
values $> 0$ and so small that $x + tv$ lies in $U$. Let $h = tv$. We have

\[
\begin{aligned}
f(x + h) - f(x) &= \lambda_1(h) + |h|\psi_1(h) \\
  &= \lambda_2(h) + |h|\psi_2(h)
\end{aligned}
\]

with

\[ \lim_{h \to 0} \psi_j(h) = 0 \]

for $j = 1, 2$. Let $\lambda = \lambda_1 - \lambda_2$. Subtracting the two expressions for

\[ f(x + tv) - f(x), \]

we find

\[ \lambda_1(h) - \lambda_2(h) = |h|(\psi_2(h) - \psi_1(h)), \]

and setting $h = tv$, using the linearity of $\lambda$,

\[ t(\lambda_1(v) - \lambda_2(v)) = t|v|(\psi_2(tv) - \psi_1(tv)). \]

We divide by $t$ and find

\[ \lambda_1(v) - \lambda_2(v) = |v|(\psi_2(tv) - \psi_1(tv)). \]

Take the limit as $t \to 0$. The limit of the right side is equal to 0. Hence
$\lambda_1(v) - \lambda_2(v) = 0$ and $\lambda_1(v) = \lambda_2(v)$. This is true for every $v \in E$, whence
$\lambda_1 = \lambda_2$, as was to be shown.

In view of the uniqueness of the continuous linear map $\lambda$, we call it
the derivative of $f$ at $x$ and denote it by $f'(x)$ or $Df(x)$. Thus $f'(x)$ is a
continuous linear map, and we can write

\[ f(x + h) - f(x) = f'(x)h + |h|\psi(h) \]

with

\[ \lim_{h \to 0} \psi(h) = 0. \]

We have written $f'(x)h$ instead of $f'(x)(h)$ for simplicity, omitting a set of
parentheses. In general we shall often write

\[ \lambda h \]

instead of $\lambda(h)$ when $\lambda$ is a linear map.

If $f$ is differentiable at every point $x$ of $U$, then we say that $f$ is
differentiable on $U$. In that case, the derivative $f'$ is a map

\[ Df = f'\colon U \to L(E, F) \]

from $U$ into the space of continuous linear maps $L(E, F)$, and thus to
each $x \in U$, we have associated the linear map $f'(x) \in L(E, F)$. If $f'$ is
continuous, we say that $f$ is of class $C^1$. Since $f'$ maps $U$ into the
Banach space $L(E, F)$, we can define inductively $f$ to be of class $C^p$ if all
derivatives $D^k f$ exist and are continuous for $1 \leq k \leq p$.

If $f\colon [a, b] \to F$ is a map of a real variable, then its derivative

\[ f'(t)\colon \mathbf{R} \to F \]

is a linear map into the vector space $F$. However, if $\lambda\colon \mathbf{R} \to F$ is any
linear map, then for all $t \in \mathbf{R}$ we have

\[ \lambda(t) = \lambda(t \cdot 1) = t\lambda(1). \]

Hence $\lambda$ is multiplication (on the right) by the vector $\lambda(1)$ in $F$, and we
usually may identify $\lambda$ with this vector.
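
The definition of this section can be tested numerically in finite dimensions. The Python sketch below is an illustration only: the map `f` on $\mathbf{R}^2$, its hand-computed Jacobian in `derivative`, and the chosen point and direction are hypothetical examples, and the sup norm is used on $\mathbf{R}^2$. It computes the norm of $\psi(h) = (f(x + h) - f(x) - f'(x)h)/|h|$ for shrinking $h$, which should tend to 0.

```python
import math

def f(p):
    x, y = p
    return (x * x * y, math.sin(x) + y)

def derivative(p, h):
    """Apply the candidate derivative f'(p), i.e. the Jacobian matrix, to h."""
    x, y = p
    return (2.0 * x * y * h[0] + x * x * h[1], math.cos(x) * h[0] + h[1])

def sup_norm(v):
    return max(abs(c) for c in v)

def psi_norm(p, h):
    """Sup norm of psi(h) = (f(p + h) - f(p) - f'(p)h) / |h|."""
    fp = f(p)
    fph = f((p[0] + h[0], p[1] + h[1]))
    lin = derivative(p, h)
    remainder = tuple(fph[i] - fp[i] - lin[i] for i in range(2))
    return sup_norm(remainder) / sup_norm(h)

point = (1.0, 2.0)
direction = (1.0, -2.0)
errors = [psi_norm(point, (t * direction[0], t * direction[1]))
          for t in (1e-1, 1e-2, 1e-3, 1e-4, 1e-5)]
```

The entries of `errors` shrink roughly in proportion to $|h|$, which is the behaviour expected of an $o(h)$ error term for a map of class $C^2$.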


XIII, §3. PROPERTIES OF THE DERIVATIVE


Sum. Let $E$, $F$ be complete normed vector spaces, and let $U$ be open in
$E$. Let $f, g\colon U \to F$ be maps which are differentiable at $x \in U$. Then
$f + g$ is differentiable at $x$ and

\[ (f + g)'(x) = f'(x) + g'(x). \]

If $c$ is a number, then

\[ (cf)'(x) = cf'(x). \]

Proof. Let $\lambda_1 = f'(x)$ and $\lambda_2 = g'(x)$ so that

\[
\begin{aligned}
f(x + h) - f(x) &= \lambda_1 h + |h|\psi_1(h), \\
g(x + h) - g(x) &= \lambda_2 h + |h|\psi_2(h),
\end{aligned}
\]

where $\lim_{h \to 0} \psi_i(h) = 0$. Then

\[
\begin{aligned}
(f + g)(x + h) - (f + g)(x) &= f(x + h) + g(x + h) - f(x) - g(x) \\
  &= \lambda_1 h + \lambda_2 h + |h|(\psi_1(h) + \psi_2(h)) \\
  &= (\lambda_1 + \lambda_2)(h) + |h|(\psi_1(h) + \psi_2(h)).
\end{aligned}
\]

Since $\lim_{h \to 0} (\psi_1(h) + \psi_2(h)) = 0$, it follows by definition that

\[ \lambda_1 + \lambda_2 = (f + g)'(x), \]

as was to be shown. The statement with the constant is equally clear.


Product. Let $F_1$, $F_2$, $G$ be complete normed vector spaces, and let
$F_1 \times F_2 \to G$ be a continuous bilinear map. Let $U$ be open in $E$ and let
$f\colon U \to F_1$ and $g\colon U \to F_2$ be maps differentiable at $x \in U$. Then the
product map $fg$ is differentiable at $x$ and

\[ (fg)'(x) = f'(x)g(x) + f(x)g'(x). \]

Before giving the proof, we make some comments on the meaning of
the product formula. The linear map represented by the right-hand side
is supposed to mean the map

\[ v \mapsto (f'(x)v)g(x) + f(x)(g'(x)v). \]

Note that $f'(x)\colon E \to F_1$ is a linear map of $E$ into $F_1$, and when applied to
$v \in E$ yields an element of $F_1$. Furthermore, $g(x)$ lies in $F_2$, and so we
can take the product

\[ (f'(x)v)g(x) \in G. \]

Similarly for $f(x)(g'(x)v)$. In practice we omit the extra set of parentheses,
and write simply

\[ f'(x)vg(x). \]

Proof. Changing the norm on $G$ if necessary, we may assume that

\[ |vw| \leq |v|\,|w| \]

for $v \in F_1$, $w \in F_2$.

We have:

\[
\begin{aligned}
f(x + h)g(x + h) &- f(x)g(x) \\
  &= f(x + h)g(x + h) - f(x + h)g(x) + f(x + h)g(x) - f(x)g(x) \\
  &= f(x + h)(g(x + h) - g(x)) + (f(x + h) - f(x))g(x) \\
  &= f(x + h)(g'(x)h + |h|\psi_2(h)) + (f'(x)h + |h|\psi_1(h))g(x) \\
  &= f(x + h)g'(x)h + |h| f(x + h)\psi_2(h) + f'(x)hg(x) + |h|\psi_1(h)g(x) \\
  &= f(x)g'(x)h + f'(x)hg(x) + (f(x + h) - f(x))g'(x)h \\
  &\qquad + |h| f(x + h)\psi_2(h) + |h|\psi_1(h)g(x).
\end{aligned}
\]

The map

\[ h \mapsto f(x)g'(x)h + f'(x)hg(x) \]

is the linear map of $E$ into $G$, which is supposed to be the desired
derivative. It remains to be shown that each of the other three terms
appearing on the right is of the desired type, namely $o(h)$. This is
immediate. For instance,

\[ |(f(x + h) - f(x))g'(x)h| \leq |f(x + h) - f(x)|\,|g'(x)|\,|h| \]

and

\[ \lim_{h \to 0} |f(x + h) - f(x)|\,|g'(x)| = 0 \]

because $f$ is continuous, being differentiable. The others are equally
obvious, and our property is proved.


Quotient. Assume that $A$ is a Banach algebra with unit $e$, and let $U$ be
the open set of invertible elements. Then the map $u \mapsto u^{-1}$ is differentiable
on $U$, and its derivative at a point $u_0$ is given by

\[ v \mapsto -u_0^{-1} v u_0^{-1}. \]

Proof. We have

\[
\begin{aligned}
(u_0 + h)^{-1} - u_0^{-1} &= (u_0(e + u_0^{-1}h))^{-1} - u_0^{-1} \\
  &= (e + u_0^{-1}h)^{-1} u_0^{-1} - u_0^{-1} \\
  &= (e - u_0^{-1}h + o(h)) u_0^{-1} - u_0^{-1} \\
  &= -u_0^{-1} h u_0^{-1} + o(h).
\end{aligned}
\]

This proves that the derivative is what we said it is.
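
The formula for the derivative of the inverse can be checked numerically in the Banach algebra of real $2 \times 2$ matrices. The Python sketch below is an illustration under assumptions: the matrices `u0` and `v` are hypothetical examples, and the entrywise sup norm is used, which is equivalent to any algebra norm in finite dimensions. It measures the ratio of the remainder $(u_0 + h)^{-1} - u_0^{-1} + u_0^{-1} h u_0^{-1}$ to $|h|$ as $h$ shrinks.

```python
def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def inverse(u):
    """Inverse of an invertible 2x2 matrix via the adjugate formula."""
    det = u[0][0] * u[1][1] - u[0][1] * u[1][0]
    return [[u[1][1] / det, -u[0][1] / det],
            [-u[1][0] / det, u[0][0] / det]]

def sup_norm(a):
    return max(abs(entry) for row in a for entry in row)

def remainder_ratio(u0, h):
    """|(u0 + h)^{-1} - u0^{-1} + u0^{-1} h u0^{-1}| / |h|."""
    u0_inv = inverse(u0)
    shifted_inv = inverse([[u0[i][j] + h[i][j] for j in range(2)]
                           for i in range(2)])
    linear = matmul(matmul(u0_inv, h), u0_inv)  # u0^{-1} h u0^{-1}
    rem = [[shifted_inv[i][j] - u0_inv[i][j] + linear[i][j] for j in range(2)]
           for i in range(2)]
    return sup_norm(rem) / sup_norm(h)

u0 = [[2.0, 1.0], [1.0, 3.0]]
v = [[1.0, -1.0], [0.5, 2.0]]
ratios = [remainder_ratio(u0, [[t * x for x in row] for row in v])
          for t in (1e-1, 1e-2, 1e-3, 1e-4)]
```

The ratios decrease roughly linearly in $t$, consistent with the remainder being $o(h)$.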


Chain Rule. Let $U$ be open in $E$ and let $V$ be open in $F$. Let $f\colon U \to V$
and $g\colon V \to G$ be maps. Let $x \in U$. Assume that $f$ is differentiable at $x$
and $g$ is differentiable at $f(x)$. Then $g \circ f$ is differentiable at $x$ and

\[ (g \circ f)'(x) = g'(f(x)) \circ f'(x). \]

Before giving the proof, we make explicit the meaning of the usual
formula. Note that $f'(x)\colon E \to F$ is a linear map, and $g'(f(x))\colon F \to G$ is a
linear map, and so these linear maps can be composed, and the composite
is a linear map, which is continuous because both $g'(f(x))$ and $f'(x)$
are continuous. The composed linear map goes from $E$ into $G$, as it
should.

Proof. Let $k(h) = f(x + h) - f(x)$. Then

\[ g(f(x + h)) - g(f(x)) = g'(f(x))k(h) + |k(h)|\psi_1(k(h)) \]

with $\lim_{k \to 0} \psi_1(k) = 0$. But

\[ k(h) = f(x + h) - f(x) = f'(x)h + |h|\psi_2(h), \]

with $\lim_{h \to 0} \psi_2(h) = 0$. Hence

\[ g(f(x + h)) - g(f(x))
   = g'(f(x))f'(x)h + |h|\,g'(f(x))\psi_2(h) + |k(h)|\psi_1(k(h)). \]

The first term has the desired shape, and all we need to show is that
each of the next two terms on the right is $o(h)$. This is obvious. For
instance, we have the estimate

\[ |k(h)| \leq |f'(x)|\,|h| + |h|\,|\psi_2(h)| \]

and

\[ \lim_{h \to 0} \psi_1(k(h)) = 0 \]

from which we see that $|k(h)|\psi_1(k(h)) = o(h)$. We argue similarly for the
other term.


Map with Coordinates. Let U be open in E, let

\[ f\colon U \to F_1 \times \cdots \times F_m, \]

and let $f = (f_1, \ldots, f_m)$ be its expression in terms of coordinate maps.
Then f is differentiable at x if and only if each $f_i$ is differentiable at x,
and if this is the case, then

\[ f'(x) = (f_1'(x), \ldots, f_m'(x)). \]

Proof. This follows as usual by considering the coordinate expression

\[ f(x+h) - f(x) = (f_1(x+h) - f_1(x), \ldots, f_m(x+h) - f_m(x)). \]

Assume that $f_i'(x)$ exists, so that

\[ f_i(x+h) - f_i(x) = f_i'(x)h + \varphi_i(h) \]

where $\varphi_i(h) = o(h)$. Then

\[ f(x+h) - f(x) = (f_1'(x)h, \ldots, f_m'(x)h) + (\varphi_1(h), \ldots, \varphi_m(h)) \]

and it is clear that this last term in $F_1 \times \cdots \times F_m$ is $o(h)$. (As always, we
use the sup norm in $F_1 \times \cdots \times F_m$.) This proves that $f'(x)$ is what we said
it is. The converse is equally easy and is left to the reader.


Theorem 3.1. Let $\lambda\colon E \to F$ be a continuous linear map. Then $\lambda$ is
differentiable at every point of E and $\lambda'(x) = \lambda$ for every $x \in E$.


Proof. This is obvious, because

\[ \lambda(x+h) - \lambda(x) = \lambda(h) + 0. \]

Note therefore that the derivative of $\lambda$ is constant on E.


Corollary 3.2. Let $f\colon U \to F$ be a differentiable map, and let $\lambda\colon F \to G$
be a continuous linear map. Then for each $x \in U$,

\[ (\lambda \circ f)'(x) = \lambda \circ f'(x), \]

so that for every $v \in E$ we have

\[ (\lambda \circ f)'(x)v = \lambda(f'(x)v). \]


Proof. This follows from Theorem 3.1 and the chain rule. Of course,
one can also give a direct proof, considering

\[
\begin{aligned}
\lambda(f(x+h)) - \lambda(f(x)) &= \lambda(f(x+h) - f(x)) \\
&= \lambda(f'(x)h + |h|\,\psi(h)) \\
&= \lambda(f'(x)h) + |h|\,\lambda(\psi(h)),
\end{aligned}
\]

and noting that $\lim_{h \to 0} \lambda(\psi(h)) = 0$.


Lemma 3.3. If f is a differentiable map on an interval [a,b] whose 
derivative is 0, then f is constant. 


Proof. Suppose that $f(t) \ne f(a)$ for some $t \in [a,b]$. By the Hahn–Banach
theorem, let $\lambda$ be a functional such that $\lambda(f(t)) \ne \lambda(f(a))$. The
map $\lambda \circ f$ is differentiable, and its derivative is equal to 0. Hence
$\lambda \circ f$ is constant on [a, b], contradiction.


Fundamental Theorem of Calculus. Let f be regulated on [a,b], and
assume that f is continuous at a point c of [a,b]. Then the map

\[ t \mapsto \int_a^t f = \varphi(t) \]

is differentiable at c and its derivative is f(c).


Proof. The standard proof works, namely

\[ \varphi(c+h) - \varphi(c) = \int_c^{c+h} f \]

and

\[ \varphi(c+h) - \varphi(c) - hf(c) = \int_c^{c+h} (f - f(c)). \]

The right-hand side is estimated by

\[ |h| \sup |f(t) - f(c)| \]

for t between c and c + h, thus proving that the derivative is f(c).
In particular, if f is of class $C^1$ on [a,b], then

\[ f(b) - f(a) = \int_a^b f'(t)\,dt. \]


XIII, §4. MEAN VALUE THEOREM


The mean value theorem essentially relates the values of a map at two 
different points by means of the intermediate values of the map on the 
line segment between these two points. In vector spaces, we give an inte- 
gral form for it. 

We shall be integrating curves in the space of continuous linear maps 
L(E, F). 

We shall also deal with the association

\[ L(E, F) \times E \to F \]

given by

\[ (\lambda, y) \mapsto \lambda(y) \]

for $\lambda \in L(E, F)$ and $y \in E$. It is a continuous bilinear map.


Let $\alpha\colon J \to L(E, F)$ be a continuous map from a closed interval J =
[a,b] into L(E, F). For each $t \in J$, we see that $\alpha(t) \in L(E, F)$ is a linear
map. We can apply it to an element $y \in E$, and $\alpha(t)y \in F$. On the other
hand, we can integrate the curve $\alpha$, and

\[ \int_a^b \alpha(t)\,dt \]

is an element of L(E, F). If $\alpha$ is differentiable, then $d\alpha(t)/dt$ is also an
element of L(E, F).


Lemma 4.1. Let $\alpha\colon J \to L(E, F)$ be a continuous map from a closed
interval J = [a, b] into L(E, F). Let $y \in E$. Then

\[ \int_a^b \alpha(t)y\,dt = \int_a^b \alpha(t)\,dt \cdot y \]

where the dot on the right means the application of the linear map

\[ \int_a^b \alpha(t)\,dt \]

to the vector y.


Proof. Here y is fixed, and the map

\[ \lambda \mapsto \lambda(y) = \lambda y \]

is a continuous linear map of L(E, F) into F. Hence our lemma follows
from the last property of the integral proved in §1.


Theorem 4.2. Let U be open in E and let $x \in U$. Let $y \in E$. Let
$f\colon U \to F$ be a $C^1$ map. Assume that the line segment $x + ty$ with
$0 \le t \le 1$ is contained in U. Then

\[ f(x+y) - f(x) = \int_0^1 f'(x+ty)y\,dt = \int_0^1 f'(x+ty)\,dt \cdot y. \]


Proof. Let $g(t) = f(x+ty)$. Then $g'(t) = f'(x+ty)y$. By the fundamental
theorem of calculus we find that

\[ g(1) - g(0) = \int_0^1 g'(t)\,dt. \]

But $g(1) = f(x+y)$ and $g(0) = f(x)$. Our theorem is proved, taking into
account the lemma which allows us to pull the y out of the integral.


Corollary 4.3. Let U be open in E and let $x, z \in U$ be such that
the line segment between x and z is contained in U (that is, the segment
$x + t(z - x)$ with $0 \le t \le 1$). Let $f\colon U \to F$ be of class $C^1$. Then

\[ |f(z) - f(x)| \le |z - x| \sup |f'(v)|, \]

the sup being taken for all v in the segment.


Proof. We estimate the integral, letting x + y = z. We find

\[ \left| \int_0^1 f'(x+ty)y\,dt \right| \le (1 - 0) \sup |f'(x+ty)|\,|y|, \]

the sup being taken for $0 \le t \le 1$. Our corollary follows.


(Note. The sup of the norms of the derivative exists because the segment
is compact and the map $t \mapsto |f'(x+ty)|$ is continuous.)
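
The integral form of the mean value theorem (Theorem 4.2) lends itself to a quick numerical check. The following is a minimal sketch under assumptions of our own choosing: E = F = R^2, a hypothetical map `f` with hand-computed Jacobian `df_apply`, and a composite Simpson rule `simpson_vector` standing in for the integral. None of these names come from the text.

```python
import math

def f(x):
    """Hypothetical C^1 map from R^2 to R^2 used only for illustration."""
    x1, x2 = x
    return (x1 * x1 * x2, math.sin(x1) + x2)

def df_apply(x, y):
    """Apply the Jacobian f'(x) of the map above to the vector y."""
    x1, x2 = x
    y1, y2 = y
    return (2 * x1 * x2 * y1 + x1 * x1 * y2, math.cos(x1) * y1 + y2)

def simpson_vector(func, n=200):
    """Composite Simpson rule on [0, 1] for an R^2-valued integrand (n even)."""
    step = 1.0 / n
    total = [0.0, 0.0]
    for i in range(n + 1):
        weight = 1 if i in (0, n) else (4 if i % 2 == 1 else 2)
        value = func(i * step)
        for j in range(2):
            total[j] += weight * value[j]
    return tuple(step / 3 * component for component in total)

x = (0.7, -1.2)
y = (0.5, 0.9)

# Left side: f(x + y) - f(x).
lhs = tuple(a - b for a, b in zip(f((x[0] + y[0], x[1] + y[1])), f(x)))
# Right side: the integral of f'(x + t y) y over 0 <= t <= 1.
rhs = simpson_vector(lambda t: df_apply((x[0] + t * y[0], x[1] + t * y[1]), y))

mvt_gap = max(abs(a - b) for a, b in zip(lhs, rhs))
```

The two sides agree up to quadrature error, so `mvt_gap` is tiny for this smooth example.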


Corollary 4.4. Let U be open in E and let $x, z, x_0 \in U$. Assume that
the segment between x and z lies in U. Then

\[ |f(z) - f(x) - f'(x_0)(z - x)| \le |z - x| \sup |f'(v) - f'(x_0)|, \]

the sup being taken for all v on the segment between x and z.


Proof. We can either apply Corollary 4.3 to the map g such that
$g(x) = f(x) - f'(x_0)x$, or argue directly with the integral:

\[ f(z) - f(x) = \int_0^1 f'(x + t(z - x))(z - x)\,dt. \]

We write

\[ f'(x + t(z - x)) = f'(x + t(z - x)) - f'(x_0) + f'(x_0), \]

and find

\[ f(z) - f(x) = f'(x_0)(z - x) + \int_0^1 [f'(x + t(z - x)) - f'(x_0)](z - x)\,dt. \]

We then estimate the integral on the right as usual.


We shall call Theorem 4.2 or either one of its two corollaries the 
mean value theorem in vector spaces. In practice, the integral form of the 
remainder is always preferable and should be used as a conditioned
reflex. One big advantage it has over the others is that the integral, as a
function of y, is just as smooth as f’, and this is important in some 
applications. In others, one needs only an intermediate value estimate, 
and then Corollary 4.3, or especially Corollary 4.4, may suffice. 


XIII, §5. THE SECOND DERIVATIVE


Let U be open in E and let $f\colon U \to F$ be differentiable. Then

\[ Df = f'\colon U \to L(E, F) \]

and we know that L(E, F) is again a complete normed vector space.
Thus we are in a position to define the second derivative

\[ D^2f = f''\colon U \to L(E, L(E, F)). \]

We have seen in Chapter IV, §1 that we can identify L(E, L(E, F)) with
L(E, E; F), which we denote by $L^2(E, F)$, i.e. the space of continuous
bilinear maps of $E \times E$ into F.


Theorem 5.1. Let $\omega\colon E_1 \times E_2 \to F$ be a continuous bilinear map. Then
$\omega$ is differentiable, and for each $(x_1, x_2) \in E_1 \times E_2$ and every

\[ (v_1, v_2) \in E_1 \times E_2 \]

we have

\[ D\omega(x_1, x_2)(v_1, v_2) = \omega(x_1, v_2) + \omega(v_1, x_2), \]

so that $D\omega\colon E_1 \times E_2 \to L(E_1 \times E_2, F)$ is linear. Hence $D^2\omega$ is constant,
and $D^3\omega = 0$.


Proof. We have by definition

\[ \omega(x_1 + h_1, x_2 + h_2) - \omega(x_1, x_2) = \omega(x_1, h_2) + \omega(h_1, x_2) + \omega(h_1, h_2). \]

This proves the first assertion, since the last term satisfies
$|\omega(h_1, h_2)| \le |\omega|\,|h_1|\,|h_2|$ and is therefore $o(h)$, and also the second, since
each of the first two terms on the right is linear in both $(x_1, x_2) = x$ and
$h = (h_1, h_2)$. We know that the derivative of a linear map is constant, and
the derivative of a constant map is 0, so the rest is obvious.


We consider especially a bilinear map

\[ \lambda\colon E \times E \to F \]

and say that $\lambda$ is symmetric if we have

\[ \lambda(v, w) = \lambda(w, v) \]

for all $v, w \in E$. In general, a multilinear map

\[ \lambda\colon E \times \cdots \times E \to F \]

is said to be symmetric if

\[ \lambda(v_1, \ldots, v_n) = \lambda(v_{\sigma(1)}, \ldots, v_{\sigma(n)}) \]

for any permutation $\sigma$ of the indices $1, \ldots, n$. In this section we look at
the symmetric bilinear case in connection with the second derivative.

We see that we may view a second derivative $D^2f(x)$ as a continuous
bilinear map. Our next theorem will be that this map is symmetric. We
need a lemma.


Lemma 5.2. Let $\lambda\colon E \times E \to F$ be a bilinear map, and assume that there
exists a map $\psi$ defined for all sufficiently small pairs $(v, w) \in E \times E$ with
values in F such that

\[ \lim_{(v, w) \to (0, 0)} \psi(v, w) = 0, \]

and that

\[ |\lambda(v, w)| \le |\psi(v, w)|\,|v|\,|w|. \]

Then $\lambda = 0$.


Proof. This is like the argument which gave us the uniqueness of the
derivative. Take $v, w \in E$ arbitrary, and let s be a positive real number
sufficiently small so that $\psi(sv, sw)$ is defined. Then

\[ |\lambda(sv, sw)| \le |\psi(sv, sw)|\,|sv|\,|sw|, \]

whence

\[ s^2|\lambda(v, w)| \le s^2|\psi(sv, sw)|\,|v|\,|w|. \]

Divide by $s^2$ and let $s \to 0$. We conclude that $\lambda(v, w) = 0$, as desired.


Theorem 5.3. Let U be open in E and let $f\colon U \to F$ be twice differentiable,
and such that $D^2f$ is continuous. Then for each $x \in U$, the bilinear
map $D^2f(x)$ is symmetric, that is

\[ D^2f(x)(v, w) = D^2f(x)(w, v) \]

for all $v, w \in E$.


Proof. Let $x \in U$ and suppose that the open ball of radius r in E
centered at x is contained in U. Let $v, w \in E$ have lengths < r/2. Let

\[ g(x) = f(x + v) - f(x). \]

Then

\[
\begin{aligned}
f(x + v + w) - f(x + w) - f(x + v) + f(x)
&= g(x + w) - g(x) = \int_0^1 g'(x + tw)w\,dt \\
&= \int_0^1 [Df(x + v + tw) - Df(x + tw)]w\,dt \\
&= \int_0^1 \int_0^1 D^2f(x + sv + tw)v\,ds \cdot w\,dt.
\end{aligned}
\]

Let

\[ \psi(sv, tw) = D^2f(x + sv + tw) - D^2f(x). \]

Then

\[
\begin{aligned}
g(x + w) - g(x) &= \int_0^1 \int_0^1 D^2f(x)(v, w)\,ds\,dt + \int_0^1 \int_0^1 \psi(sv, tw)v \cdot w\,ds\,dt \\
&= D^2f(x)(v, w) + \varphi(v, w)
\end{aligned}
\]

where $\varphi(v, w)$ is the second integral on the right, and satisfies the
estimate

\[ |\varphi(v, w)| \le \sup_{s,t} |\psi(sv, tw)|\,|v|\,|w|. \]

The sup is taken for $0 \le s \le 1$ and $0 \le t \le 1$. If we had started with

\[ g_1(x) = f(x + w) - f(x) \]

and considered $g_1(x + v) - g_1(x)$, we would have found another expression
for

\[ f(x + v + w) - f(x + w) - f(x + v) + f(x), \]

namely

\[ D^2f(x)(w, v) + \varphi_1(v, w) \]

where

\[ |\varphi_1(v, w)| \le \sup_{s,t} |\psi_1(sv, tw)|\,|v|\,|w|. \]

But then

\[ D^2f(x)(w, v) - D^2f(x)(v, w) = \varphi(v, w) - \varphi_1(v, w). \]

By the lemma, and the continuity of $D^2f$ which guarantees that

\[ \sup_{s,t} |\psi(sv, tw)| \quad\text{and}\quad \sup_{s,t} |\psi_1(sv, tw)| \]

satisfy the limit condition of the lemma, we now conclude that

\[ D^2f(x)(w, v) = D^2f(x)(v, w), \]

as was to be shown.

For an application of the second derivative, cf. the Morse—Palais 
lemma in Chapter XIII. It describes the behavior of a function in a 
neighborhood of a critical point in a manner used for instance in the 
calculus of variations. 


XIII, §6. HIGHER DERIVATIVES AND
TAYLOR'S FORMULA


We may now consider higher derivatives. We define

\[ D^pf(x) = D(D^{p-1}f)(x). \]

Thus $D^pf(x)$ is an element of $L(E, L(E, \ldots, L(E, F)\ldots))$ which we denote
by $L^p(E, F)$. We say that f is of class $C^p$ on U or is a $C^p$ map if $D^kf(x)$
exists for each $x \in U$, and if

\[ D^kf\colon U \to L^k(E, F) \]

is continuous for each $k = 0, \ldots, p$.


We have trivially $D^qD^rf(x) = D^pf(x)$ if q + r = p and if $D^pf(x)$ exists.
Also the p-th derivative $D^p$ is linear in the sense that

\[ D^p(f + g) = D^pf + D^pg \quad\text{and}\quad D^p(cf) = cD^pf. \]

If $\lambda \in L^p(E, F)$ we write

\[ \lambda(v_1)(v_2)\cdots(v_p) = \lambda(v_1, \ldots, v_p). \]

If q + r = p, we can evaluate $\lambda(v_1, \ldots, v_p)$ in two steps, namely

\[ \lambda(v_1, \ldots, v_q) \cdot (v_{q+1}, \ldots, v_p). \]

We regard $\lambda(v_1, \ldots, v_q)$ as the element of $L^{p-q}(E, F)$ given by

\[ \lambda(v_1, \ldots, v_q) \cdot (v_{q+1}, \ldots, v_p) = \lambda(v_1, \ldots, v_p). \]


Lemma 6.1. Let $v_2, \ldots, v_p$ be fixed elements of E. Assume that f is p
times differentiable on U. Let

\[ g(x) = D^{p-1}f(x)(v_2, \ldots, v_p). \]

Then g is differentiable on U and

\[ Dg(x)(v) = D^pf(x)(v, v_2, \ldots, v_p). \]

Proof. The map $g\colon U \to F$ is a composite of the maps

\[ D^{p-1}f\colon U \to L^{p-1}(E, F) \quad\text{and}\quad \lambda\colon L^{p-1}(E, F) \to F, \]

where $\lambda$ is given by the evaluation at $(v_2, \ldots, v_p)$. Thus $\lambda$ is continuous
and linear. It is an old theorem that

\[ D(\lambda \circ D^{p-1}f) = \lambda \circ DD^{p-1}f = \lambda \circ D^pf, \]

namely the corollary of Theorem 3.1. Thus

\[ Dg(x)v = (D^pf(x)v)(v_2, \ldots, v_p), \]

which is precisely what we wanted to prove.


Theorem 6.2. Let f be of class $C^p$ on U. Then for each $x \in U$ the map
$D^pf(x)$ is multilinear symmetric.


Proof. By induction on $p \ge 2$. For p = 2 this is Theorem 5.3. In
particular, if we let $g = D^{p-2}f$ we know that for $v_1, v_2 \in E$,

\[ D^2g(x)(v_1, v_2) = D^2g(x)(v_2, v_1), \]

and since $D^pf = D^2D^{p-2}f$ we conclude that

\[
\begin{aligned}
(*) \qquad D^pf(x)(v_1, \ldots, v_p) &= (D^2D^{p-2}f(x))(v_1, v_2) \cdot (v_3, \ldots, v_p) \\
&= (D^2D^{p-2}f(x))(v_2, v_1) \cdot (v_3, \ldots, v_p) \\
&= D^pf(x)(v_2, v_1, v_3, \ldots, v_p).
\end{aligned}
\]

Let $\sigma$ be a permutation of $(2, \ldots, p)$. By induction,

\[ D^{p-1}f(x)(v_{\sigma(2)}, \ldots, v_{\sigma(p)}) = D^{p-1}f(x)(v_2, \ldots, v_p). \]

By the lemma, we conclude that

\[ (**) \qquad D^pf(x)(v_1, v_{\sigma(2)}, \ldots, v_{\sigma(p)}) = D^pf(x)(v_1, \ldots, v_p). \]

From (*) and (**) we conclude that $D^pf(x)$ is symmetric because any
permutation of $(1, \ldots, p)$ can be expressed as a composition of the
permutations considered in (*) or (**). This proves the theorem.


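For the higher derivatives, we have similar statements to those obtained
with the first derivative in relation to linear maps. Observe that if
$\omega \in L^p(E, F)$ is a multilinear map, and $\lambda \in L(F, G)$ is linear, we may
compose these

\[ E \times \cdots \times E \xrightarrow{\ \omega\ } F \xrightarrow{\ \lambda\ } G \]

to get $\lambda \circ \omega$, which is a multilinear map of $E \times \cdots \times E \to G$. Furthermore,
$\omega$ and $\lambda$ being continuous, it is clear that $\lambda \circ \omega$ is also continuous.
Finally, the map

\[ \lambda_*\colon L^p(E, F) \to L^p(E, G) \]

given by "composition with $\lambda$", namely

\[ \omega \mapsto \lambda \circ \omega, \]

is immediately verified to be a continuous linear map, that is for $\omega_1$,
$\omega_2 \in L^p(E, F)$ and $c \in \mathbf{R}$ we have

\[ \lambda \circ (\omega_1 + \omega_2) = \lambda \circ \omega_1 + \lambda \circ \omega_2 \quad\text{and}\quad \lambda \circ (c\omega_1) = c\lambda \circ \omega_1, \]

and for the continuity,

\[ |\lambda \circ \omega(v_1, \ldots, v_p)| \le |\lambda|\,|\omega|\,|v_1| \cdots |v_p|, \]

so

\[ |\lambda \circ \omega| \le |\lambda|\,|\omega|. \]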


Theorem 6.3. Let $f\colon U \to F$ be p times differentiable and let $\lambda\colon F \to G$ be
a continuous linear map. Then for every $x \in U$ we have

\[ D^p(\lambda \circ f)(x) = \lambda \circ D^pf(x). \]

Proof. Consider the map $x \mapsto D^{p-1}(\lambda \circ f)(x)$. By induction,

\[ D^{p-1}(\lambda \circ f)(x) = \lambda \circ D^{p-1}f(x). \]

By Corollary 3.2 concerning the derivative

\[ D(\lambda_* \circ D^{p-1}f), \]

namely the derivative of the composite map

\[ U \xrightarrow{\ D^{p-1}f\ } L^{p-1}(E, F) \xrightarrow{\ \lambda_*\ } L^{p-1}(E, G), \]

we get the assertion of our theorem.


If one wishes to omit the x from the notation in Theorem 6.3, then
one must write

\[ D^p(\lambda \circ f) = \lambda_* \circ D^pf. \]

Occasionally, one omits the lower * and writes simply $D^p(\lambda \circ f) = \lambda \circ D^pf$.


Taylor's Formula. Let U be open in E and let $f\colon U \to F$ be of class $C^p$.
Let $x \in U$ and let $y \in E$ be such that the segment $x + ty$, $0 \le t \le 1$, is
contained in U. Denote by $y^{(k)}$ the k-tuple $(y, y, \ldots, y)$. Then

\[ f(x + y) = f(x) + \frac{Df(x)y}{1!} + \cdots + \frac{D^{p-1}f(x)y^{(p-1)}}{(p-1)!} + R_p \]

where

\[ R_p = \int_0^1 \frac{(1-t)^{p-1}}{(p-1)!} D^pf(x + ty)y^{(p)}\,dt. \]


Proof. We can give two proofs, the first by integration by parts as
usual, starting with the mean value theorem,

\[ f(x + y) = f(x) + \int_0^1 Df(x + ty)y\,dt. \]

We consider the map $t \mapsto Df(x + ty)y$ of the interval into F, and the
usual product

\[ \mathbf{R} \times F \to F, \]

which consists in multiplying vectors of F by numbers. We let

\[ u = Df(x + ty)y, \quad v = -(1 - t), \quad\text{and}\quad dv = dt. \]

This gives the next term, and then we proceed by induction, letting

\[ u = D^pf(x + ty)y^{(p)} \quad\text{and}\quad dv = \frac{(1-t)^{p-1}}{(p-1)!}\,dt \]

at the p-th stage. Integration by parts yields the next term of Taylor's
formula, plus the next remainder term.

The other proof can be given by using the Hahn–Banach theorem and
applying a continuous linear functional to the formula. This reduces the
proof to the ordinary case of functions of one variable, that is with
values in R. Of course, in that case, we also proceed by induction, so
there is really not much to choose from between the two proofs.

The remainder term $R_p$ can also be written in the form

\[ R_p = \int_0^1 \frac{(1-t)^{p-1}}{(p-1)!} D^pf(x + ty)\,dt \cdot y^{(p)}. \]

The mapping

\[ y \mapsto \int_0^1 \frac{(1-t)^{p-1}}{(p-1)!} D^pf(x + ty)\,dt \]

is continuous. If f is infinitely differentiable, then this mapping is
infinitely differentiable since we shall see later that one can differentiate
under the integral sign.


Estimate of the Remainder. With notation as in Taylor's formula, we
can also write

\[ f(x + y) = f(x) + \frac{Df(x)y}{1!} + \cdots + \frac{D^pf(x)y^{(p)}}{p!} + \theta(y) \]

where

\[ |\theta(y)| \le \sup_{0 \le t \le 1} |D^pf(x + ty) - D^pf(x)|\,\frac{|y|^p}{p!} \]

and

\[ \lim_{y \to 0} \frac{\theta(y)}{|y|^p} = 0. \]


Proof. We write

\[ D^pf(x + ty) - D^pf(x) = \psi(ty). \]

Since $D^pf$ is continuous, it is bounded in some ball containing x, and

\[ \lim_{y \to 0} \psi(ty) = 0 \]

uniformly in t. On the other hand, the remainder $R_p$ given above can be
written as

\[ R_p = \int_0^1 \frac{(1-t)^{p-1}}{(p-1)!} D^pf(x)y^{(p)}\,dt + \int_0^1 \frac{(1-t)^{p-1}}{(p-1)!} \psi(ty)y^{(p)}\,dt. \]

We integrate the first integral to obtain the desired p-th term, and estimate
the second integral by

\[ \sup_{0 \le t \le 1} |\psi(ty)|\,|y|^p \int_0^1 \frac{(1-t)^{p-1}}{(p-1)!}\,dt, \]

where we can again perform the integration to get the estimate for the
error term $\theta(y)$.
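
Taylor's formula with the integral remainder can be tested numerically in the simplest setting. The sketch below assumes E = F = R and f = sin, so that the k-th derivative applied to the k-tuple (y, ..., y) is just the k-th ordinary derivative times y^k. The helper names (`sin_derivative`, `simpson`, `taylor_with_remainder`) are hypothetical, and a composite Simpson rule approximates the integral.

```python
import math

def sin_derivative(k, x):
    """k-th derivative of sin at x, using sin^(k)(x) = sin(x + k*pi/2)."""
    return math.sin(x + k * math.pi / 2)

def simpson(func, n=400):
    """Composite Simpson rule on [0, 1] with n (even) subintervals."""
    step = 1.0 / n
    total = func(0.0) + func(1.0)
    for i in range(1, n):
        total += (4 if i % 2 == 1 else 2) * func(i * step)
    return total * step / 3

def taylor_with_remainder(x, y, p):
    """Return the Taylor polynomial of order p-1 at x and the remainder R_p."""
    polynomial = sum(
        sin_derivative(k, x) * y**k / math.factorial(k) for k in range(p)
    )
    remainder = simpson(
        lambda t: (1 - t) ** (p - 1) / math.factorial(p - 1)
        * sin_derivative(p, x + t * y) * y**p
    )
    return polynomial, remainder

x, y, p = 0.3, 0.8, 4
polynomial, remainder = taylor_with_remainder(x, y, p)

# Taylor's formula says f(x + y) equals the polynomial plus R_p.
taylor_gap = abs(math.sin(x + y) - (polynomial + remainder))
# Since every derivative of sin is bounded by 1, |R_p| <= |y|^p / p!.
remainder_bound = abs(y) ** p / math.factorial(p)
```

Here `taylor_gap` measures only the quadrature and rounding error, and the computed remainder stays below the crude bound |y|^p / p!.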


Theorem 6.4. Let U be open in E and let $f\colon U \to F_1 \times \cdots \times F_m$ be a
map with coordinate maps $(f_1, \ldots, f_m)$. Then f is of class $C^p$ if and only
if each $f_i$ is of class $C^p$, and if that is the case, then

\[ D^pf = (D^pf_1, \ldots, D^pf_m). \]


Proof. We proved this for p = 1 in §3, and the general case follows by
induction.


Theorem 6.5. Let U be open in E and V open in F. Let $f\colon U \to V$ and
$g\colon V \to G$ be $C^p$ maps. Then $g \circ f$ is of class $C^p$.


Proof. We have

\[ D(g \circ f)(x) = Dg(f(x)) \circ Df(x). \]

Thus $D(g \circ f)$ is obtained by composing a lot of maps, namely as
represented in the following diagram:

\[ U \xrightarrow{\ (Dg \circ f,\ Df)\ } L(F, G) \times L(E, F) \xrightarrow{\ \text{composition}\ } L(E, G), \]

where the first component factors as $U \xrightarrow{f} V \xrightarrow{Dg} L(F, G)$ and the second
component is $Df\colon U \to L(E, F)$.

If p = 1, then all mappings occurring on the right are continuous and so
$D(g \circ f)$ is continuous. By induction, Dg and Df are of class $C^{p-1}$, and
all the maps used to obtain $D(g \circ f)$ are of class $C^{p-1}$ (the last one on
the right is a composition of linear maps, and is continuous bilinear, so
infinitely differentiable by Theorem 5.1). Hence $D(g \circ f)$ is of class $C^{p-1}$,
whence $g \circ f$ is of class $C^p$, as was to be shown.


XIII, §7. PARTIAL DERIVATIVES


Consider a product $E = E_1 \times \cdots \times E_n$ of complete normed vector spaces.
Let $U_i$ be open in $E_i$ and let

\[ f\colon U_1 \times \cdots \times U_n \to F \]

be a map. We write an element $x \in U_1 \times \cdots \times U_n$ in terms of its
"coordinates", namely $x = (x_1, \ldots, x_n)$ with $x_i \in U_i$.

We can form partial derivatives just as in the simple case when E =
$\mathbf{R}^n$. Indeed, for $x_1, \ldots, x_{i-1}, x_{i+1}, \ldots, x_n$ fixed, we consider the partial
map

\[ x_i \mapsto f(x_1, \ldots, x_i, \ldots, x_n) \]

of $U_i$ into F. If this map is differentiable, we call its derivative the partial
derivative of f and denote it by $D_if(x)$ at the point x. Thus, if it exists,

\[ D_if(x) = \lambda\colon E_i \to F \]

is the unique continuous linear map $\lambda \in L(E_i, F)$ such that

\[ f(x_1, \ldots, x_i + h, \ldots, x_n) - f(x_1, \ldots, x_n) = \lambda(h) + o(h), \]

for $h \in E_i$ and small enough that the left-hand side is defined.


Theorem 7.1. Let $U_i$ be open in $E_i$ $(i = 1, \ldots, n)$ and let

\[ f\colon U_1 \times \cdots \times U_n \to F \]

be a map. This map is of class $C^p$ if and only if each partial derivative

\[ D_if\colon U_1 \times \cdots \times U_n \to L(E_i, F) \]

is of class $C^{p-1}$. If this is the case, and

\[ v = (v_1, \ldots, v_n) \in E_1 \times \cdots \times E_n, \]

then

\[ Df(x)v = \sum_{i=1}^n D_if(x)v_i. \]
Proof. We shall give the proof just for n = 2, to save space. We
assume that the partial derivatives are continuous, and want to prove
that the derivative of f exists and is given by the formula of the theorem.
We let (x, y) be the point at which we compute the derivative, and let
$h = (h_1, h_2)$. We have

\[
\begin{aligned}
f(x + h_1, y + h_2) - f(x, y)
&= f(x + h_1, y + h_2) - f(x + h_1, y) + f(x + h_1, y) - f(x, y) \\
&= \int_0^1 D_2f(x + h_1, y + th_2)h_2\,dt + \int_0^1 D_1f(x + th_1, y)h_1\,dt.
\end{aligned}
\]

Since $D_2f$ is continuous, the map $\psi$ given by

\[ \psi(h_1, th_2) = D_2f(x + h_1, y + th_2) - D_2f(x, y) \]

satisfies

\[ \lim_{h \to 0} \psi(h_1, th_2) = 0. \]

Thus we can write the first integral as

\[
\begin{aligned}
\int_0^1 D_2f(x + h_1, y + th_2)h_2\,dt &= \int_0^1 D_2f(x, y)h_2\,dt + \int_0^1 \psi(h_1, th_2)h_2\,dt \\
&= D_2f(x, y)h_2 + \int_0^1 \psi(h_1, th_2)h_2\,dt.
\end{aligned}
\]

Estimating the error term given by this last integral, we find

\[
\begin{aligned}
\left| \int_0^1 \psi(h_1, th_2)h_2\,dt \right| &\le \sup_{0 \le t \le 1} |\psi(h_1, th_2)|\,|h_2| \\
&\le |h| \sup |\psi(h_1, th_2)| \\
&= o(h).
\end{aligned}
\]

Similarly, the second integral yields

\[ D_1f(x, y)h_1 + o(h). \]

Adding these terms, we find that Df(x, y) exists and is given by the
formula, which also shows that the map Df = f' is continuous, so f is of
class $C^1$. If each partial is of class $C^{p-1}$, then the formula shows that Df
is of class $C^{p-1}$, so f is $C^p$. We leave the converse to the reader.


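It will be useful to have a notation for linear maps of products into
products. We treat the special case of two factors. We wish to describe
linear maps

\[ \lambda\colon E_1 \times E_2 \to F_1 \times F_2. \]

We contend that such a linear map can be represented by a matrix

\[ \begin{pmatrix} \lambda_{11} & \lambda_{12} \\ \lambda_{21} & \lambda_{22} \end{pmatrix} \]

where each $\lambda_{ij}\colon E_j \to F_i$ is itself a linear map. We thus take matrices
whose components are not numbers any more but are themselves linear
maps. This is done as follows.

Suppose we are given four linear maps $\lambda_{ij}$ as above. An element of
$E_1 \times E_2$ may be viewed as a pair of elements $(v_1, v_2)$ with $v_1 \in E_1$ and
$v_2 \in E_2$. We now write such a pair as a column vector

\[ \begin{pmatrix} v_1 \\ v_2 \end{pmatrix} \]

and define $\lambda(v_1, v_2)$ to be

\[ \begin{pmatrix} \lambda_{11} & \lambda_{12} \\ \lambda_{21} & \lambda_{22} \end{pmatrix} \begin{pmatrix} v_1 \\ v_2 \end{pmatrix} = \begin{pmatrix} \lambda_{11}v_1 + \lambda_{12}v_2 \\ \lambda_{21}v_1 + \lambda_{22}v_2 \end{pmatrix} \]

so that we multiply just as we would with numbers. Then it is clear that
$\lambda$ is a linear map of $E_1 \times E_2$ into $F_1 \times F_2$.


Conversely, let $\lambda\colon E_1 \times E_2 \to F_1 \times F_2$ be a linear map. We write an
element $(v_1, v_2) \in E_1 \times E_2$ in the form

\[ (v_1, v_2) = (v_1, 0) + (0, v_2). \]

We also write $\lambda$ in terms of its coordinate maps $\lambda = (\lambda_1, \lambda_2)$ where
$\lambda_1\colon E_1 \times E_2 \to F_1$ and $\lambda_2\colon E_1 \times E_2 \to F_2$ are linear. Then

\[
\begin{aligned}
\lambda(v_1, v_2) &= (\lambda_1(v_1, v_2), \lambda_2(v_1, v_2)) \\
&= (\lambda_1(v_1, 0) + \lambda_1(0, v_2), \lambda_2(v_1, 0) + \lambda_2(0, v_2)).
\end{aligned}
\]

The map

\[ v_1 \mapsto \lambda_1(v_1, 0) \]

is a linear map of $E_1$ into $F_1$ which we call $\lambda_{11}$. Similarly, we let

\[
\begin{aligned}
\lambda_{11}(v_1) &= \lambda_1(v_1, 0), & \lambda_{12}(v_2) &= \lambda_1(0, v_2), \\
\lambda_{21}(v_1) &= \lambda_2(v_1, 0), & \lambda_{22}(v_2) &= \lambda_2(0, v_2).
\end{aligned}
\]

Then we can represent $\lambda$ as the matrix

\[ \begin{pmatrix} \lambda_{11} & \lambda_{12} \\ \lambda_{21} & \lambda_{22} \end{pmatrix} \]

as explained in the preceding discussion, and we see that $\lambda(v_1, v_2)$ is
given by the multiplication of the above matrix with the vertical vector
formed with $v_1$ and $v_2$.

Finally, we observe that if all $\lambda_{ij}$ are continuous, then the map $\lambda$ is
also continuous, and conversely.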


We can apply this to the case of partial derivatives, and we formulate
the result as a corollary.


Corollary 7.2. Let U be open in $E_1 \times E_2$ and let $f\colon U \to F_1 \times F_2$ be a
$C^p$ map. Let $f = (f_1, f_2)$ be represented by its coordinate maps

\[ f_1\colon U \to F_1 \quad\text{and}\quad f_2\colon U \to F_2. \]

Then for any $x \in U$, the linear map Df(x) is represented by the matrix

\[ \begin{pmatrix} D_1f_1(x) & D_2f_1(x) \\ D_1f_2(x) & D_2f_2(x) \end{pmatrix}. \]


Proof. This follows by applying Theorem 7.1 to each one of the maps
$f_1$ and $f_2$, and using the definitions of the preceding discussion.


Observe that except for the fact that we deal with linear maps, all that
precedes is treated just like the standard way for functions on open sets
of n-space, where the derivatives follow exactly the same formalism with
respect to the partial derivatives.


Theorem 7.3. Let U be open in $E_1 \times E_2$ and let $f\colon U \to F$ be a map
such that $D_1f$, $D_2f$, $D_1D_2f$, and $D_2D_1f$ exist and are continuous. Then

\[ D_1D_2f = D_2D_1f. \]


Proof. The proof is entirely analogous to the standard proof of the
similar result for functions of two variables, and will be left to the reader.
Actually, if we assume that f is of class $C^2$, then the second derivative
$D^2f(x)$ is represented by the matrix $(D_iD_jf(x))$, with i, j = 1, 2. By
Theorem 5.3, we know that $D^2f(x)$ is symmetric, whence we conclude that

\[ D_1D_2f(x) = D_2D_1f(x). \]


XIII, §8. DIFFERENTIATING UNDER THE INTEGRAL SIGN


Theorem 8.1. Let U be open in E and let J = [a,b] be an interval. Let
$f\colon J \times U \to F$ be a continuous map such that $D_2f$ exists and is
continuous. Let

\[ g(x) = \int_a^b f(t, x)\,dt. \]

Then g is differentiable on U and

\[ Dg(x) = \int_a^b D_2f(t, x)\,dt. \]


Proof. Differentiability is a property relating to a point, so let $x \in U$.
Selecting a sufficiently small open neighborhood V of x, we can assume
that $D_2f$ is bounded on $J \times V$. Let $\lambda$ be the linear map

\[ \lambda = \int_a^b D_2f(t, x)\,dt. \]

We investigate

\[
\begin{aligned}
g(x + h) - g(x) - \lambda h &= \int_a^b [f(t, x + h) - f(t, x) - D_2f(t, x)h]\,dt \\
&= \int_a^b \left[ \int_0^1 D_2f(t, x + uh)h\,du - D_2f(t, x)h \right] dt \\
&= \int_a^b \left\{ \int_0^1 [D_2f(t, x + uh) - D_2f(t, x)]h\,du \right\} dt.
\end{aligned}
\]

We estimate:

\[ |g(x + h) - g(x) - \lambda h| \le (b - a) \max |D_2f(t, x + uh) - D_2f(t, x)|\,|h|, \]

the maximum being taken for $0 \le u \le 1$ and $a \le t \le b$. By the relative
uniform continuity of $D_2f$ with respect to the compact set $J \times \{x\}$, we
conclude that given $\epsilon$ there exists $\delta$ such that whenever $|h| < \delta$ then this
maximum is $< \epsilon$. This proves that $\lambda$ is the derivative $g'(x)$, as desired.
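
Differentiation under the integral sign is easy to test numerically. The following sketch makes its own assumptions: E = F = R, J = [0, 1], and the hypothetical integrand f(t, x) = sin(tx), whose partial derivative in x is t cos(tx). It compares a central finite difference of g with the integral of the partial derivative; the function names are illustrative only.

```python
import math

def simpson(func, a, b, n=400):
    """Composite Simpson rule on [a, b] with n (even) subintervals."""
    step = (b - a) / n
    total = func(a) + func(b)
    for i in range(1, n):
        total += (4 if i % 2 == 1 else 2) * func(a + i * step)
    return total * step / 3

def g(x):
    """g(x) = integral over [0, 1] of f(t, x) = sin(t x)."""
    return simpson(lambda t: math.sin(t * x), 0.0, 1.0)

def dg_under_integral(x):
    """Integral over [0, 1] of D_2 f(t, x) = t cos(t x)."""
    return simpson(lambda t: t * math.cos(t * x), 0.0, 1.0)

x0 = 1.3
step = 1e-5

# Central difference approximation of g'(x0).
finite_difference = (g(x0 + step) - g(x0 - step)) / (2 * step)
derivative_gap = abs(finite_difference - dg_under_integral(x0))
```

The two values agree to within the finite-difference and quadrature error, consistent with the formula for Dg(x).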


XIII, §9. DIFFERENTIATION OF SEQUENCES


Theorem 9.1. Let U be an open subset of a Banach space E, and let
$\{f_n\}$ be a sequence of $C^1$ maps of U into a Banach space F. Assume
that $\{f_n\}$ converges pointwise to a map f, and also that the sequence of
derivatives $\{f_n'\}$ converges uniformly, to a mapping

\[ g\colon U \to L(E, F). \]

Then f is differentiable, and f' = g.

Proof. Let $x_0 \in U$. Differentiability being a local property, we can
assume without loss of generality that U is an open ball centered at $x_0$.
For $x \in U$, we have by the mean value theorem applied to $f_n - f_m$:

\[ |f_n(x) - f_m(x) - (f_n(x_0) - f_m(x_0))| \le |x - x_0| \sup_{y \in U} |f_n'(y) - f_m'(y)|. \]

Given $\epsilon$ there exists N such that if $m, n \ge N$, then

\[ \|f_n' - f_m'\| < \epsilon \quad\text{and}\quad \|f_n' - g\| < \epsilon. \]

Letting m tend to infinity, we conclude that for $n \ge N$ we have

\[ (1) \qquad |f_n(x) - f(x) - (f_n(x_0) - f(x_0))| \le |x - x_0|\,\epsilon. \]

Fix $n \ge N$. Again by the mean value theorem, there exists $\delta$ such that if
$|x - x_0| < \delta$ we have

\[ (2) \qquad |f_n(x) - f_n(x_0) - f_n'(x_0)(x - x_0)| \le |x - x_0|\,\epsilon. \]

Finally, use the fact that $\|f_n' - g\| < \epsilon$. We conclude from (1) and (2) that

\[ |f(x) - f(x_0) - g(x_0)(x - x_0)| \le 3|x - x_0|\,\epsilon. \]

This proves our theorem.


XIII, §10. EXERCISES 


1. Let U be open in E and V open in F. Let

\[ f\colon U \to V \quad\text{and}\quad g\colon V \to G \]

be of class $C^p$. Let $x_0 \in U$. Assume that $D^kf(x_0) = 0$ for all $k = 0, \ldots, p$.
Show that $D^k(g \circ f)(x_0) = 0$ for $0 \le k \le p$. [Hint: Induction.] Also prove
that if $D^kg(f(x_0)) = 0$ for $0 \le k \le p$, then $D^k(g \circ f)(x_0) = 0$ for $0 \le k \le p$.


2. Let $f(t) = \sum c_nt^n$ be a power series with real coefficients, converging in a
circle of radius r. Let A be a Banach algebra. Show that the map

\[ u \mapsto \sum c_nu^n \]

is a $C^1$ (or even $C^\infty$) map on the disc of radius r centered at the origin in A.


3. Let E, F be Banach spaces, and Lis(E, F) the set of toplinear isomorphisms
between E and F. Show that the map $u \mapsto u^{-1}$ from Lis(E, F) to Lis(F, E) is
differentiable, and find its derivative (as in the case of Banach algebras).


4. Let A be a Banach algebra with unit e. Show that one can define a square
root function in a neighborhood of e, in such a manner that it is of class $C^1$
(or even $C^\infty$).


5. Let Z be a compact topological space, E a Banach space, and $F = C^0(Z, E)$
the Banach space of continuous maps of Z into E, with the sup norm. Let U
be open in E, and let V be the subset of F consisting of all maps $f\colon Z \to E$
which map Z into U, so $V = C^0(Z, U)$. Let $g\colon U \to G$ be a map of U into a
Banach space G.

(a) If g is continuous, show that the map

\[ f \mapsto g \circ f \]

of V into $C^0(Z, G)$ is continuous.
(b) If g is of class $C^1$, show that the above map is of class $C^1$, and find a
formula for its derivative.
(c) If g is of class $C^p$, show that the above map is of class $C^p$.


6. Let J = [a,b] be a closed interval, and let U be open in a Banach space E.
Let $g\colon U \to G$ be a $C^1$ map. Let $C^0(J, U)$ be the set of continuous maps of J
into U. Show that $C^0(J, U)$ is open in $C^0(J, E)$, and that the map

\[ \alpha \mapsto \int g \circ \alpha = S_g(\alpha) \]

is of class $C^1$. The notation means that

\[ S_g(\alpha)(t) = \int_a^t g(\alpha(u))\,du. \]

Find an expression for the derivative of $S_g$.


7. Let f be a map of class $C^1$ on a Banach space E such that f(tx) = tf(x) for
all real t and all $x \in E$. Show that f is linear, and in fact that f(x) = f'(0)x.


8. Let f be a map of class $C^2$ on a Banach space E such that $f(tx) = t^2f(x)$ for
all real t and all $x \in E$. Show that f is quadratic, and that in fact

\[ f(x) = \tfrac{1}{2} D^2f(0)(x, x). \]

Generalize Exercises 7 and 8.


9. Let E be a Banach space, and J = [a,b] a closed interval. For each $C^1$
curve $\alpha\colon J \to E$ let the $C^1$ norm of $\alpha$ be defined by

\[ \|\alpha\|_{C^1} = \|\alpha\| + \|\alpha'\| \]

where $\alpha'$ is the derivative of $\alpha$. Show that this is a norm, and that the space
$C^1(J, E)$ of $C^1$ curves is complete under this norm.


10. Let U be open in a Banach space and let $BC^p(U, F)$ be the space of maps
$f\colon U \to F$ into a Banach space F which are of class $C^p$, and such that all
derivatives $D^kf$ are bounded, for $k = 0, \ldots, p$. Show that $BC^p(U, F)$ is a
Banach space, under the norm

\[ \|f\|_{C^p} = \sup \|D^kf\|, \]

the sup being taken for $0 \le k \le p$.


11. This exercise is a starting point for the calculus of variations. Let E be a
Banach space and U an open subset of $\mathbf{R} \times E \times E$. Let

\[ H\colon U \to \mathbf{R} \]

be a $C^p$ function. Let J be a closed interval [a,b]. Let V be the subset of
$C^1(J, E)$ consisting of all curves $\alpha \in C^1(J, E)$ such that the curve

\[ t \mapsto (t, \alpha(t), \alpha'(t)), \qquad t \in J, \]

lies in U.
(a) Show that V is open in $C^1(J, E)$.
(b) Show that the map

\[ \alpha \mapsto \int_a^b H(t, \alpha(t), \alpha'(t))\,dt = f(\alpha) \]

is a $C^p$ function on V, and determine its derivative.
(c) Let $g\colon J \to L(E, \mathbf{R})$ be a continuous function such that

\[ \int_a^b g(t)\sigma(t)\,dt = 0 \]

for every $C^1$ curve $\sigma\colon J \to E$ having the property that

\[ \sigma(a) = \sigma(b) = 0. \]

Show that g = 0.
(d) Let $C^1(J, E, 0)$ be the subset of curves $\alpha$ in $C^1(J, E)$ such that

\[ \alpha(a) = \alpha(b) = 0. \]

Show that $C^1(J, E, 0)$ is a closed subspace. Restrict the function f to the
open set

\[ V_0 = V \cap C^1(J, E, 0) \]

of this closed subspace. Show that if an element $\alpha \in V_0$ is a local
minimum of f in $V_0$, then

\[ \frac{d}{dt} D_3H(t, \alpha(t), \alpha'(t)) = D_2H(t, \alpha(t), \alpha'(t)) \]

for all $t \in J$. [Hint: Show that if a function has a local minimum at a
point, then its derivative is 0 at that point, and use (c).]


CHAPTER XIV 


Inverse Mappings and 
Differential Equations 


XIV, §1. THE INVERSE MAPPING THEOREM 


Both the inverse mapping theorem and the existence theorem for differen- 
tial equations will be based on a basic and simple lemma in complete 
metric spaces. 


Lemma 1.1 (Shrinking Lemma). Let M be a complete metric space, and
let $T\colon M \to M$ be a mapping. Assume that there exists a number K with
0 < K < 1 such that for all $x, y \in M$ we have

\[ d(Tx, Ty) \le K\,d(x, y), \]

where d is the distance function in M. Then T has a unique fixed point
z, that is a point such that Tz = z. If $x \in M$, then

\[ z = \lim_{n \to \infty} T^nx. \]


Proof. For simplicity of notation, we assume that M is a closed subset
of a Banach space. We first observe that a fixed point z, if it exists, is
unique because if $z_1$ is also fixed, then

\[ |z - z_1| = |Tz - Tz_1| \le K|z - z_1|, \]

so $z - z_1 = 0$. Now for existence, let m, n be positive integers and say
$n \ge m$, $n = m + r$. Then for any x we have

\[ |T^nx - T^mx| \le K^m|x - T^rx| \]

and

\[
\begin{aligned}
|x - T^rx| &\le |x - Tx| + |Tx - T^2x| + \cdots + |T^{r-1}x - T^rx| \\
&\le (1 + K + \cdots + K^{r-1})|x - Tx|.
\end{aligned}
\]

Since the geometric sum is bounded by 1/(1 - K), this shows that the
sequence $\{T^nx\}$ is Cauchy, converging to some element $z \in M$. This
element z is a fixed point because

\[ |Tz - TT^nx| \le K|z - T^nx| \]

and for n sufficiently large, $TT^nx$ approaches z and also Tz. This proves
the shrinking lemma.
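
The proof of the shrinking lemma is constructive, and the iteration it describes can be run directly. The sketch below is an illustration under assumptions of our own: M is the interval [0, 1], T is the cosine (which maps [0, 1] into itself), and K = sin(1) serves as a shrinking constant because |cos'(x)| = |sin x| is at most sin(1) on [0, 1]. The helper name `iterate` is hypothetical.

```python
import math

def iterate(T, x, steps):
    """Return the list [x, Tx, T^2 x, ..., T^steps x]."""
    orbit = [x]
    for _ in range(steps):
        orbit.append(T(orbit[-1]))
    return orbit

K = math.sin(1.0)  # shrinking constant for cos on [0, 1]

orbit = iterate(math.cos, 0.0, 100)
z = orbit[-1]

# z should be (numerically) fixed by T.
fixed_point_residual = abs(math.cos(z) - z)

# A different starting point should lead to the same limit (uniqueness).
other_limit = iterate(math.cos, 1.0, 100)[-1]

# The geometric series estimate from the proof bounds the distance
# between T^n x and the limit by K^n / (1 - K) * |x - Tx|.
n = 20
a_priori_bound = K**n / (1 - K) * abs(orbit[0] - orbit[1])
observed_distance = abs(orbit[n] - z)
```

In this example the observed distance is far smaller than the a priori bound, since near the fixed point the map contracts more strongly than the global constant K suggests.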


We shall call K in the lemma a shrinking constant for T. 

Let U be open in a Banach space E, and let $f\colon U \to F$ be a $C^p$ map
($p \ge 1$). We shall say that f is a $C^p$-isomorphism or is $C^p$-invertible on U
if the image f(U) is an open set V in F, and if there exists a $C^p$ map

\[ g\colon V \to U \]

such that $g \circ f$ and $f \circ g$ are the identity maps on U and V respectively.
We say that f is a local $C^p$-isomorphism at a point x in U, or is locally
$C^p$-invertible at x, if there exists an open set $U_1$ contained in U and
containing x such that the restriction of f to $U_1$ is $C^p$-invertible on $U_1$.

It is clear that the composite of two $C^p$-isomorphisms is again a
$C^p$-isomorphism, and that the composite of two locally $C^p$-invertible
maps is also locally $C^p$-invertible. In other words, if f is locally
$C^p$-invertible at x, if f(x) is contained in some open set V, and if $g\colon V \to G$ is
locally $C^p$-invertible at f(x), then $g \circ f$ is locally $C^p$-invertible at x.

The inverse mapping theorem provides a criterion for a map to be
locally $C^p$-invertible, in terms of its derivative.


Theorem 1.2 (Inverse Mapping Theorem). Let U be open in a Banach
space E, and let $f\colon U \to F$ be a $C^p$ map. Let $x_0 \in U$ and assume that
$f'(x_0)\colon E \to F$ is a toplinear isomorphism (i.e. invertible as a continuous
linear map). Then f is a local $C^p$-isomorphism at $x_0$.


Proof. Let $\lambda = f'(x_0)$. It suffices to prove that $\lambda^{-1} \circ f$ is locally
invertible at $x_0$ because we may consider $\lambda^{-1} \circ f$ instead of $f$ itself. Thus
we have reduced our theorem to the case where $E = F$ and $f'(x_0)$ is the
identity mapping. Next, making translations, it suffices to prove our
theorem when $x_0 = 0$ and $f(x_0) = 0$ also. From now on, we make these
additional assumptions. 

Let $g(x) = x - f(x)$. Then $g'(0) = 0$ and by continuity there exists
$r > 0$ such that if $|x| < 2r$, then

$$|g'(x)| \leq \tfrac{1}{2}.$$

From the mean value theorem we see that $|g(x)| \leq \tfrac{1}{2}|x|$, and hence that $g$
maps the closed ball $\bar B_r(0)$ into $\bar B_{r/2}(0)$. We contend that given $y \in \bar B_{r/2}(0)$,
there exists a unique element $x \in \bar B_r(0)$ such that $f(x) = y$. We prove this
by considering the map

$$g_y(x) = y + x - f(x).$$

If $|y| \leq r/2$ and $|x| \leq r$, then $|g_y(x)| \leq r$ and hence $g_y$ may be viewed as a
mapping of the complete metric space $\bar B_r(0)$ into itself. The bound of $\tfrac{1}{2}$
on the derivative together with the mean value theorem shows that $g_y$ is
a shrinking map, i.e. that

$$|g_y(x_1) - g_y(x_2)| = |g(x_1) - g(x_2)| \leq \tfrac{1}{2}|x_1 - x_2|$$

for $x_1, x_2 \in \bar B_r(0)$. By the shrinking lemma, it follows that $g_y$ has a
unique fixed point, which is precisely the solution of the equation
$f(x) = y$. This proves our contention.


We obtain a local inverse for $f$, which we denote by $f^{-1}$. This inverse
is continuous, because writing $x = x - f(x) + f(x)$ we see that

$$|x_1 - x_2| \leq |f(x_1) - f(x_2)| + |g(x_1) - g(x_2)|
\leq |f(x_1) - f(x_2)| + \tfrac{1}{2}|x_1 - x_2|,$$

whence

$$|x_1 - x_2| \leq 2|f(x_1) - f(x_2)|.$$

We shall now see that this inverse is differentiable on the open ball
$B_{r/2}(0)$. Indeed, fix $y_1 \in B_{r/2}(0)$ and let $y_1 = f(x_1)$ with $x_1 \in \bar B_r(0)$. Let
$y \in B_{r/2}(0)$, and let $y = f(x)$ with $x \in \bar B_r(0)$. Then:

$$|f^{-1}(y) - f^{-1}(y_1) - f'(x_1)^{-1}(y - y_1)|
= |x - x_1 - f'(x_1)^{-1}(f(x) - f(x_1))|. \tag{$*$}$$

From the differentiability of $f$, we can write

$$f(x) = f(x_1) + f'(x_1)(x - x_1) + o(x - x_1).$$

If we substitute this in (*), we find the expression

$$|f'(x_1)^{-1} o(x - x_1)|.$$

Of course, $f'(x_1)^{-1}$ is bounded by a fixed constant and by what we have
already seen, we have

$$|x - x_1| \leq 2|y - y_1|.$$

From the definition of differentiability, we conclude that $f^{-1}$ is differentiable
at $y_1$ and that its derivative is given by

$$f'(x_1)^{-1} = f'(f^{-1}(y_1))^{-1}.$$

Since the mappings $f^{-1}$, $f'$, "inverse" are continuous, it follows that
$D(f^{-1})$ is continuous and thus that $f^{-1}$ is of class $C^1$. Since taking
inverses is $C^\infty$, it follows inductively that $f^{-1}$ is $C^p$, as was to be shown.
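
The construction of the local inverse in this proof can be imitated numerically in one variable. The sketch below is only an illustration under stated assumptions: the function `local_inverse` and the sample map $f(x) = x + x^2/4$ are hypothetical, chosen so that $f(0) = 0$, $f'(0) = 1$, and $|g'(x)| \leq 1/2$ for $|x| \leq 1$, which allows $r = 1/2$.

```python
def local_inverse(f, y, tol=1e-13, max_iter=1000):
    """Solve f(x) = y by iterating g_y(x) = y + x - f(x), starting from x = 0."""
    x = 0.0
    for _ in range(max_iter):
        x_next = y + x - f(x)
        if abs(x_next - x) < tol:
            return x_next
        x = x_next
    raise RuntimeError("iteration did not converge")

# Hypothetical example: g(x) = x - f(x) = -x**2 / 4 has |g'(x)| = |x| / 2 <= 1/2
# for |x| <= 1, so one may take r = 1/2 in the proof.
f = lambda x: x + x * x / 4
x = local_inverse(f, 0.2)   # |y| = 0.2 <= r/2 = 0.25
```

The returned value lies in the closed ball of radius $r$, as the contention in the proof predicts.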

We shall generalize part of the inverse mapping theorem in Chapter 
XV, §3. 

In some applications it is necessary to know that if the derivative of a 
map is close to the identity, then the image of a ball contains a ball of 
only slightly smaller radius. The precise statement follows. In this book, 
it will be used only in the proof of the change of variables formula, and 
therefore may be omitted until the reader needs it. 


Lemma 1.3. Let $U$ be open in $E$, and let $f: U \to E$ be of class $C^1$.
Assume that $f(0) = 0$, $f'(0) = I$. Let $r > 0$ and assume that $\bar B_r(0) \subset U$.
Let $0 < s < 1$, and assume that

$$|f'(z) - f'(x)| \leq s$$

for all $x, z \in \bar B_r(0)$. If $y \in E$ and $|y| \leq (1 - s)r$, then there exists a
unique $x \in \bar B_r(0)$ such that $f(x) = y$.


Proof. The map $g_y$ given by $g_y(x) = x - f(x) + y$ is defined for $|x| \leq r$
and $|y| \leq (1 - s)r$, and maps $\bar B_r(0)$ into itself because, from the estimate

$$|f(x) - x| = |f(x) - f(0) - f'(0)x| \leq |x| \sup |f'(z) - f'(0)| \leq sr,$$

we obtain

$$|g_y(x)| \leq sr + (1 - s)r = r.$$

Furthermore, $g_y$ is a shrinking map because, from the mean value theorem,
we get

$$|g_y(x_1) - g_y(x_2)| = |x_1 - x_2 - (f(x_1) - f(x_2))|
= |x_1 - x_2 - f'(0)(x_1 - x_2) + \delta(x_1, x_2)|
= |\delta(x_1, x_2)|$$

where

$$|\delta(x_1, x_2)| \leq |x_1 - x_2| \sup |f'(z) - f'(0)| \leq s|x_1 - x_2|.$$

Hence $g_y$ has a unique fixed point $x \in \bar B_r(0)$ which is such that $f(x) = y$.
This proves the lemma. 


XIV, §2. THE IMPLICIT MAPPING THEOREM 
Its statement is as follows. 


Theorem 2.1. Let $U$, $V$ be open sets in Banach spaces $E$, $F$ respectively,
and let

$$f: U \times V \to G$$

be a $C^p$ mapping. Let $(a, b) \in U \times V$, and assume that

$$D_2 f(a, b): F \to G$$

is a toplinear isomorphism. Let $f(a, b) = 0$. Then there exists a continuous
map $g: U_0 \to V$ defined on an open neighborhood $U_0$ of $a$ such that
$g(a) = b$ and such that

$$f(x, g(x)) = 0$$

for all $x \in U_0$. If $U_0$ is taken to be a sufficiently small ball, then $g$ is
uniquely determined, and is also of class $C^p$.


Proof. Let $\lambda = D_2 f(a, b)$. Replacing $f$ by $\lambda^{-1} \circ f$ we may assume without
loss of generality that $D_2 f(a, b)$ is the identity; after this reduction
$f$ takes its values in $F$. Consider the map

$$\varphi: U \times V \to E \times F$$

given by

$$\varphi(x, y) = (x, f(x, y)).$$

Then the derivative of $\varphi$ at $(a, b)$ is immediately computed to be represented
by the matrix

$$D\varphi(a, b) = \begin{pmatrix} I_E & 0 \\ D_1 f(a, b) & D_2 f(a, b) \end{pmatrix}
= \begin{pmatrix} I_E & 0 \\ D_1 f(a, b) & I_F \end{pmatrix},$$

whence $\varphi$ is locally invertible at $(a, b)$ since the inverse of $D\varphi(a, b)$ exists
and is the matrix

$$\begin{pmatrix} I_E & 0 \\ -D_1 f(a, b) & I_F \end{pmatrix}.$$

We denote the local inverse of $\varphi$ by $\psi$. We can write

$$\psi(x, z) = (x, h(x, z))$$

where $h$ is some mapping of class $C^p$. We define

$$g(x) = h(x, 0).$$

Then certainly $g$ is of class $C^p$ and

$$(x, f(x, g(x))) = \varphi(x, g(x)) = \varphi(x, h(x, 0)) = \varphi(\psi(x, 0)) = (x, 0).$$


This proves the existence of a C? map g satisfying our requirements. 

Now for the uniqueness, suppose that $g_0$ is a continuous map defined
near $a$ such that $g_0(a) = b$ and $f(x, g_0(x)) = 0$ for all $x$ near $a$. Then
$g_0(x)$ is near $b$ for such $x$, and hence

$$\varphi(x, g_0(x)) = (x, 0).$$

Since $\varphi$ is invertible near $(a, b)$ it follows that there is a unique point
$(x, y)$ near $(a, b)$ such that $\varphi(x, y) = (x, 0)$. Let $U_0$ be a small ball on
which $g$ is defined. If $g_0$ is also defined on $U_0$, then the above argument
shows that $g$ and $g_0$ coincide on some smaller neighborhood of $a$. Let
$x \in U_0$ and let $v = x - a$. Consider the set of those numbers $t$ with
$0 \leq t \leq 1$ such that $g(a + tv) = g_0(a + tv)$. This set is not empty. Let $s$
be its least upper bound. By continuity, we have $g(a + sv) = g_0(a + sv)$.
If $s < 1$, we can apply the existence and that part of the uniqueness just
proved to show that $g$ and $g_0$ are in fact equal in a neighborhood of
$a + sv$. Hence $s = 1$, and our uniqueness statement is proved, as well as
the theorem. 


Note. The particular value $f(a, b) = 0$ in the preceding theorem is
irrelevant. If $f(a, b) = c$ for some $c \neq 0$, then the above proof goes
through replacing 0 by $c$ everywhere.
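
For a concrete feeling of what the implicit map $g$ is, one can compute it numerically in the scalar case $E = F = G = \mathbf{R}$. The sketch below does not follow the proof above, which goes through the inverse mapping theorem applied to $\varphi$. It uses instead a simplified Newton-type iteration $y \mapsto y - D_2 f(a, b)^{-1} f(x, y)$ with the derivative frozen at $(a, b)$, which is a shrinking map for $x$ near $a$. The names `implicit_solve` and `d2f_ab_inv` and the unit-circle example are assumptions made for the illustration.

```python
import math

def implicit_solve(f, d2f_ab_inv, x, b, tol=1e-13, max_iter=1000):
    """Solve f(x, y) = 0 for y near b via y <- y - D_2 f(a, b)^{-1} f(x, y)."""
    y = b
    for _ in range(max_iter):
        y_next = y - d2f_ab_inv * f(x, y)
        if abs(y_next - y) < tol:
            return y_next
        y = y_next
    raise RuntimeError("iteration did not converge; x may be too far from a")

# Hypothetical example: the unit circle near (a, b) = (0, 1), where D_2 f(a, b) = 2.
f = lambda x, y: x * x + y * y - 1.0
g = lambda x: implicit_solve(f, 0.5, x, 1.0)
```

Here `g(x)` should agree with $\sqrt{1 - x^2}$ for small $x$, which is the unique continuous solution through $(0, 1)$.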


XIV, §3. EXISTENCE THEOREM FOR 
DIFFERENTIAL EQUATIONS 


Let $E$ be a Banach space and $U$ an open set in $E$. By a vector field on
$U$ we simply mean a mapping $f: U \to E$, which we interpret as assigning
a vector to each point of $U$. We shall assume our vector field is $C^p$
with $p \geq 1$. Let $x_0$ be a point of $U$. An integral curve for $f$ with initial
condition $x_0$ is a mapping of class $C^r$ ($r \geq 1$)

$$\alpha: J \to U$$

defined on an open interval $J$ containing 0, such that $\alpha(0) = x_0$ and such
that

$$\alpha'(t) = f(\alpha(t)).$$

We visualize this as saying that the velocity (tangent) vector of the curve
$\alpha$ at a point is equal to the vector associated to that point by the vector
field. We observe that an integral curve can also be viewed as a solution
of the integral equation

$$\alpha(t) = x_0 + \int_0^t f(\alpha(s))\, ds.$$

Namely, any solution of this integral equation is obviously an integral
curve of $f$ with the specified initial condition, and conversely, such
an integral curve satisfies the integral equation. Furthermore, we observe
that an integral curve for $f$ is then necessarily of class $C^{p+1}$, by
induction.

We shall say that f satisfies a Lipschitz condition on U if there exists a 
number K > 0 such that 


$$|f(x) - f(y)| \leq K|x - y|$$

for all $x, y \in U$. We then call $K$ a Lipschitz constant for $f$. If $f$ is of
class $C^1$, it follows at once from the mean value theorem that $f$ is locally
Lipschitz, that is Lipschitz in the neighborhood of every point, and that
it is bounded on such a neighborhood.

Let $f: U \to E$ be a vector field and $x_0 \in U$. By a local flow at $x_0$ we
mean a mapping

$$\alpha: J \times U_0 \to U$$

where $J$ is an open interval containing 0, and $U_0$ is an open subset of $U$
containing $x_0$, such that for each $x$ in $U_0$ the map

$$t \mapsto \alpha_x(t) = \alpha(t, x)$$

is an integral curve for $f$ with initial condition $x$ (namely $\alpha(0, x) = x$).
We define a local flow with the eventual intent to analyze its dependence
on $x$. However, for this section, the occurrence of $x$ is still incidental,
and is introduced only to get some uniformity results. We shall prove
that a local flow always exists if $f$ satisfies a Lipschitz condition.

Theorem 3.1. Let $f: U \to E$ be a vector field satisfying a Lipschitz
condition with constant $K > 0$. Let $x_0 \in U$. Let $0 < a < 1$, assume that
the closed ball $\bar B_{2a}(x_0)$ is contained in $U$, and that $f$ is bounded by a
constant $L > 0$ on this ball. If $b$ is a number $> 0$ such that $b < a/L$
and $b < 1/K$, then there exists a unique local flow

$$\alpha: J_b \times B_a(x_0) \to U$$

where $J_b$ is the open interval $-b < t < b$, and $B_a(x_0)$ is the open ball of
radius $a$ centered at $x_0$.


Proof. Let $I_b$ be the closed interval $-b \leq t \leq b$ and let $x$ be a point
in $B_a(x_0)$. Let $M$ be the set of continuous maps

$$\alpha: I_b \to \bar B_{2a}(x_0)$$

of the closed interval into the closed ball of center $x_0$ and radius $2a$, such
that $\alpha(0) = x$. We view $M$ as a subset of the space of continuous maps
of $I_b$ into $E$, with the sup norm. Then $M$ is complete. For each $\alpha$ in $M$
we define the curve $S\alpha$ by

$$S\alpha(t) = x + \int_0^t f(\alpha(u))\, du.$$

Then $S\alpha$ is certainly continuous, and $S\alpha(0) = x$. The distance of any
point of $S\alpha$ from $x$ is bounded by the norm of the integral, and we have
the estimate

$$|S\alpha(t) - x| \leq bL < a.$$

Since $|x - x_0| < a$, the curve $S\alpha$ takes its values in $\bar B_{2a}(x_0)$.
Hence $S\alpha$ lies in $M$, so $S$ maps $M$ into itself. Furthermore, $S$ is a
shrinking map, because for $\alpha$, $\beta$ in $M$ we have

$$\|S\alpha - S\beta\| \leq b \sup |f(\alpha(u)) - f(\beta(u))| \leq bK\|\alpha - \beta\|.$$


We can now apply the shrinking lemma to conclude the proof of our 
theorem. 
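
The map $S$ of this proof can be iterated on a computer (Picard iteration). The sketch below is an illustration under simplifying assumptions not made in the text: $E = \mathbf{R}$, the curve is sampled on a uniform grid of $[0, b]$, and the integral is replaced by the trapezoid rule. The function name `picard` and the example $f(x) = x$ (so $K = 1$ and $b = 1/2 < 1/K$) are hypothetical.

```python
import math

def picard(f, x0, b, n_steps=2000, n_iter=40):
    """Approximate the integral curve of alpha' = f(alpha), alpha(0) = x0, on [0, b]."""
    h = b / n_steps
    alpha = [x0] * (n_steps + 1)          # start from the constant curve
    for _ in range(n_iter):
        values = [f(a) for a in alpha]    # f(alpha(u)) sampled on the grid
        new_alpha = [x0]
        integral = 0.0
        for k in range(n_steps):
            integral += 0.5 * h * (values[k] + values[k + 1])  # trapezoid rule
            new_alpha.append(x0 + integral)                    # (S alpha)(t_{k+1})
        alpha = new_alpha
    return alpha

# Hypothetical example: alpha' = alpha, alpha(0) = 1, whose solution is exp(t).
curve = picard(lambda x: x, 1.0, 0.5)
```

Each pass through the outer loop applies a discretized version of $S$ once, so the sup-norm distance to the limit curve shrinks at least by the factor $bK$.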


If we fix the initial condition $x$, then each integral curve $\alpha_x$ is of
course differentiable. However, we shall be interested in the dependence
on $x$, and it is already easy to show continuity.


Corollary 3.2. The local flow $\alpha$ in Theorem 3.1 is continuous. Furthermore,
the map $x \mapsto \alpha_x$ of $B_a(x_0)$ into the space of curves is continuous,
and in fact satisfies a Lipschitz condition.

Proof. The second statement obviously implies the first. So fix $x$ in
$B_a(x_0)$ and take $y$ close to $x$ in $B_a(x_0)$. We let $S_x$ be the shrinking map of
the theorem, corresponding to the initial condition $x$. Then

$$\|\alpha_x - S_y\alpha_x\| = \|S_x\alpha_x - S_y\alpha_x\| \leq |x - y|.$$

Let $C = bK$ so $0 < C < 1$. Then

$$\|\alpha_x - S_y^n\alpha_x\| \leq \|\alpha_x - S_y\alpha_x\| + \|S_y\alpha_x - S_y^2\alpha_x\| + \cdots + \|S_y^{n-1}\alpha_x - S_y^n\alpha_x\|
\leq (1 + C + \cdots + C^{n-1})|x - y|.$$

Since the limit of $S_y^n\alpha_x$ is equal to $\alpha_y$ as $n$ goes to infinity, the continuity
of the map $x \mapsto \alpha_x$ follows at once. In fact, the map satisfies a Lipschitz
condition as stated.


It is easy to formulate a uniqueness theorem for integral curves over 
their whole domain of definition. 


Theorem 3.3. Let $U$ be open in $E$ and let $f: U \to E$ be a vector field of
class $C^p$, $p \geq 1$. Let

$$\alpha_1: J_1 \to U \quad\text{and}\quad \alpha_2: J_2 \to U$$

be two integral curves for $f$ with the same initial condition $x_0$. Then $\alpha_1$
and $\alpha_2$ are equal on $J_1 \cap J_2$.


Proof. Let $Q$ be the set of numbers $b$ such that $\alpha_1(t) = \alpha_2(t)$ for
$0 \leq t < b$. Then $Q$ contains some number $b > 0$ by the local uniqueness
theorem. If $Q$ is not bounded from above, the equality of $\alpha_1(t)$ and $\alpha_2(t)$
for all $t > 0$ follows at once. If $Q$ is bounded from above, let $b$ be its
least upper bound. We must show that $b$ is the right end point of
$J_1 \cap J_2$. Suppose that this is not the case. Define curves $\beta_1$ and $\beta_2$ near
0 by

$$\beta_1(t) = \alpha_1(b + t) \quad\text{and}\quad \beta_2(t) = \alpha_2(b + t).$$

Then $\beta_1$ and $\beta_2$ are integral curves of $f$ with the initial conditions $\alpha_1(b)$
and $\alpha_2(b)$ respectively. The values $\beta_1(t)$ and $\beta_2(t)$ are equal for small
negative $t$ because $b$ is the least upper bound of $Q$. By continuity it
follows that $\alpha_1(b) = \alpha_2(b)$, and finally we see from the local uniqueness
theorem that

$$\beta_1(t) = \beta_2(t)$$

for all $t$ in some neighborhood of 0, whence $\alpha_1$ and $\alpha_2$ are equal in a
neighborhood of $b$, contradicting the fact that $b$ is a least upper bound
of $Q$. We can argue the same way towards the left end points, and thus
prove our theorem.

For each $x \in U$, let $J(x)$ be the union of all open intervals containing 0
on which integral curves for $f$ are defined, with initial condition equal to
$x$. Then Theorem 3.3 allows us to define the integral curve uniquely on
all of $J(x)$.


Remark. The choice of 0 as the initial time value is made for convenience.
From Theorem 3.3 one obtains at once (making a time translation)
the analogous statement for an integral curve defined on any open
interval; in other words, if $J_1$, $J_2$ do not necessarily contain 0, and $t_0$ is a
point in $J_1 \cap J_2$ such that $\alpha_1(t_0) = \alpha_2(t_0)$, and also we have the differential
equations

$$\alpha_1'(t) = f(\alpha_1(t)) \quad\text{and}\quad \alpha_2'(t) = f(\alpha_2(t)),$$

then $\alpha_1$ and $\alpha_2$ are equal on $J_1 \cap J_2$. One can also repeat the proof of
Theorem 3.3 in this case.


In practice, one meets vector fields which may be time dependent, and 
also depend on parameters. We discuss these to show that their study 
reduces to the study of the standard case. 


Time-Dependent Vector Fields 
Let $J$ be an open interval, $U$ open in a Banach space $E$, and

$$f: J \times U \to E$$

a $C^p$ map, which we view as depending on time $t \in J$. Thus for each $t$,
the map $x \mapsto f(t, x)$ is a vector field on $U$. Define

$$\bar f: J \times U \to \mathbf{R} \times E$$

by

$$\bar f(t, x) = (1, f(t, x))$$

and view $\bar f$ as a time-independent vector field on $J \times U$. Let $\bar\alpha$ be its
flow, so that

$$D_1\bar\alpha(t, s, x) = \bar f(\bar\alpha(t, s, x)), \qquad \bar\alpha(0, s, x) = (s, x).$$

We note that $\bar\alpha$ has its values in $J \times U$ and thus can be expressed in
terms of two components. In fact, it follows at once that we can write $\bar\alpha$
in the form

$$\bar\alpha(t, s, x) = (t + s, \bar\alpha_2(t, s, x)).$$

Then $\bar\alpha_2$ satisfies the differential equation

$$D_1\bar\alpha_2(t, s, x) = f(t + s, \bar\alpha_2(t, s, x))$$

as we see from the definition of $\bar f$. Let

$$\beta(t, x) = \bar\alpha_2(t, 0, x).$$

Then $\beta$ is a flow for $f$, i.e. satisfies the differential equation

$$D_1\beta(t, x) = f(t, \beta(t, x)), \qquad \beta(0, x) = x.$$

Given $x \in U$, any value of $t$ such that $\alpha$ is defined at $(t, x)$ is also such
that $\bar\alpha$ is defined at $(t, 0, x)$ because $\alpha_x$ and $\beta_x$ are integral curves of the
same vector field, with the same initial condition, hence are equal. Thus
the study of time-dependent vector fields is reduced to the study of
time-independent ones.
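
The reduction just described is also how one usually treats time-dependent equations numerically: the time variable is appended to the state. The following sketch is an illustration only. The helper names `rk4_flow` and `autonomize`, the use of the classical Runge-Kutta scheme, and the example $f(t, x) = tx$ are assumptions and are not part of the text.

```python
import math

def rk4_flow(field, state, t, n_steps=1000):
    """Integrate state' = field(state) from time 0 to time t with the classical RK4 scheme."""
    h = t / n_steps
    for _ in range(n_steps):
        k1 = field(state)
        k2 = field([s + 0.5 * h * k for s, k in zip(state, k1)])
        k3 = field([s + 0.5 * h * k for s, k in zip(state, k2)])
        k4 = field([s + h * k for s, k in zip(state, k3)])
        state = [s + h * (a + 2 * b + 2 * c + d) / 6
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4)]
    return state

def autonomize(f):
    """Turn f(t, x) into the time-independent field (s, x) -> (1, f(s, x))."""
    return lambda state: [1.0, f(state[0], state[1])]

# Hypothetical example: x' = t * x, x(0) = 1, whose solution is exp(t**2 / 2).
f = lambda t, x: t * x
s, x = rk4_flow(autonomize(f), [0.0, 1.0], 1.0)
```

The first component of the result is simply $t + s$ with $s = 0$, in agreement with the form of $\bar\alpha$ found above, and the second component approximates $\beta(t, x)$.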


Dependence on Parameters 
Let $V$ be open in some space $F$ and let

$$g: J \times V \times U \to E$$

be a map which we view as a time-dependent vector field on $U$, also
depending on parameters in $V$. We define

$$G: J \times V \times U \to F \times E$$

by

$$G(t, z, y) = (0, g(t, z, y))$$

for $t \in J$, $z \in V$, and $y \in U$. This is now a time-dependent vector field on
$V \times U$. A local flow for $G$ depends on three variables, say $\beta(t, z, y)$, with
initial condition $\beta(0, z, y) = (z, y)$. The map $\beta$ has two components, and it
is immediately clear that we can write

$$\beta(t, z, y) = (z, \alpha(t, z, y))$$

for some map $\alpha$ depending on three variables. Consequently $\alpha$ satisfies
the differential equation

$$D_1\alpha(t, z, y) = g(t, z, \alpha(t, z, y)), \qquad \alpha(0, z, y) = y,$$

which gives the flow of our original vector field $g$ depending on the
parameters $z \in V$. This procedure reduces the study of differential
equations depending on parameters to those which are independent of
parameters.


XIV, §4. LOCAL DEPENDENCE ON INITIAL CONDITIONS 


We shall now see that the map $x \mapsto \alpha_x$ in fact depends differentiably on $x$.
The proof, which depends on a very simple application of the implicit 
mapping theorem in Banach spaces, was found independently by Pugh 
and Robbin. 

Let $U$ be open in $E$ and let $f: U \to E$ be a $C^p$ map (which we call a
vector field). Let $b > 0$ and let $I_b$ be the closed interval of radius $b$
centered at 0. Let

$$F = C^0(I_b, E)$$

be the Banach space of continuous maps of $I_b$ into $E$. We let $V$ be the
subset of $F$ consisting of all continuous curves

$$\sigma: I_b \to U$$

mapping $I_b$ into our open set $U$. Then it is clear that $V$ is open in $F$
because for each curve $\sigma$ the image $\sigma(I_b)$ is compact, hence at a finite
distance from the complement of $U$, so that any curve close to it is also
contained in $U$.
We define a map

$$T: U \times V \to F$$

by

$$T(x, \sigma) = x + \int_0 f \circ \sigma - \sigma.$$

Here we omit the dummy variable of integration, and $x$ stands for the
constant curve with value $x$. If we evaluate the curve $T(x, \sigma)$ at $t$, then
by definition we have

$$T(x, \sigma)(t) = x + \int_0^t f(\sigma(u))\, du - \sigma(t).$$
0 


Lemma 4.1. The map $T$ is of class $C^p$, and its second partial derivative
is given by the formula

$$D_2 T(x, \sigma) = \int_0 Df \circ \sigma - I$$

where $I$ is the identity. In terms of $t$, this reads:

$$D_2 T(x, \sigma)h(t) = \int_0^t Df(\sigma(u))h(u)\, du - h(t).$$


Proof. It is clear that the first partial derivative $D_1 T$ exists and is
continuous, in fact $C^\infty$, being linear in $x$ up to a translation. To determine
the second partial, we apply the definition of the derivative. The
derivative of the map $\sigma \mapsto \sigma$ is of course the identity. We have to get the
derivative with respect to $\sigma$ of the integral expression. We have for small
$h$:

$$\left\| \int_0 f \circ (\sigma + h) - \int_0 f \circ \sigma - \int_0 (Df \circ \sigma)h \right\|
\leq \int_0 |f \circ (\sigma + h) - f \circ \sigma - (Df \circ \sigma)h|.$$

We estimate the expression inside the integral at each point $u$, with $u$
between 0 and the upper variable of integration. From the mean value
theorem, we get

$$|f(\sigma(u) + h(u)) - f(\sigma(u)) - Df(\sigma(u))h(u)|
\leq \|h\| \sup |Df(z_u) - Df(\sigma(u))|$$

where the sup is taken over all points $z_u$ on the segment between $\sigma(u)$
and $\sigma(u) + h(u)$. Since $Df$ is continuous, and using the fact that the
image of the curve $\sigma(I_b)$ is compact, we conclude (as in the case of
uniform continuity) that as $\|h\| \to 0$, the expression

$$\sup |Df(z_u) - Df(\sigma(u))|$$

also goes to 0. (Put the $\epsilon$ and $\delta$ in yourself.) By definition, this gives us
the derivative of the integral expression in $\sigma$. The derivative of the final
term is obviously the identity, so this proves that $D_2 T$ is given by the
formula which we wrote down.

This derivative does not depend on $x$. It is continuous in $\sigma$. Namely,
we have

$$D_2 T(x, \tau) - D_2 T(x, \sigma) = \int_0 [Df \circ \tau - Df \circ \sigma].$$

If $\sigma$ is fixed and $\tau$ is close to $\sigma$, then $Df \circ \tau - Df \circ \sigma$ is small, as one
proves easily from the compactness of $\sigma(I_b)$, as in the proof of uniform
continuity. Thus $D_2 T$ is continuous. By Theorem 7.1 of Chapter XIII
we now conclude that $T$ is of class $C^1$.

The derivative of $D_2 T$ with respect to $\sigma$ can again be computed as
before if $Df$ is itself of class $C^1$, and thus by induction, if $f$ is of class $C^p$
we conclude that $D_2 T$ is of class $C^{p-1}$ so that by Theorem 7.1 of Chapter
XIII, we conclude that $T$ itself is of class $C^p$. This proves our lemma.


We observe that a solution of the equation

$$T(x, \sigma) = 0$$

is precisely an integral curve for the vector field, with initial condition 
equal to x. Thus we are in a situation where we want to apply the 
implicit mapping theorem. 

Lemma 4.2. Let $x_0 \in U$. Let $a > 0$ be such that $Df$ is bounded, say by
a number $C_1 > 0$, on the ball $B_a(x_0)$ (we can always find such $a$ since $Df$
is continuous at $x_0$). Let $b < 1/C_1$. Then $D_2 T(x, \sigma)$ is invertible for all
$(x, \sigma)$ in $B_a(x_0) \times V$.


Proof. We have an estimate

$$\left| \int_0^t Df(\sigma(u))h(u)\, du \right| \leq bC_1\|h\|.$$

This means that

$$|D_2 T(x, \sigma) + I| < 1,$$

and hence that $D_2 T(x, \sigma)$ is invertible, as a continuous linear map, thus
proving Lemma 4.2.


Theorem 4.3. Let $p$ be a positive integer, and let $f: U \to E$ be a $C^p$
vector field. Let $x_0 \in U$. Then there exist numbers $a, b > 0$ such that
the local flow

$$\alpha: J_b \times B_a(x_0) \to U$$

is of class $C^p$.


Proof. We take $a$ so small and then $b$ so small that the local flow
exists and is uniquely determined by Theorem 3.1. We then take $b$
smaller and $a$ smaller so as to satisfy the hypotheses of Lemma 4.2. We
can then apply the implicit mapping theorem to conclude that the map
$x \mapsto \alpha_x$ is of class $C^p$. Of course, we have to consider the flow $\alpha$ and still
must show that $\alpha$ itself is of class $C^p$. It will suffice to prove that $D_1\alpha$
and $D_2\alpha$ are of class $C^{p-1}$, by Theorem 7.1 of Chapter XIII. We first
consider the case $p = 1$.

We could derive the continuity of $\alpha$ from Corollary 3.2 but we can
also get it as an immediate consequence of the continuity of the map
$x \mapsto \alpha_x$. Indeed, fixing $(s, y)$ we have

$$|\alpha(t, x) - \alpha(s, y)| \leq |\alpha(t, x) - \alpha(t, y)| + |\alpha(t, y) - \alpha(s, y)|
\leq \|\alpha_x - \alpha_y\| + |\alpha_y(t) - \alpha_y(s)|.$$

Since $\alpha_y$ is continuous (being differentiable), we get the continuity of $\alpha$.
Since

$$D_1\alpha(t, x) = f(\alpha(t, x)),$$

we conclude that $D_1\alpha$ is a composite of continuous maps, whence
continuous.
Let $\varphi$ be the derivative of the map $x \mapsto \alpha_x$, so that

$$\varphi: B_a(x_0) \to L(E, C^0(I_b, E)) = L(E, F)$$

is of class $C^{p-1}$. Then

$$\alpha_{x+w} - \alpha_x = \varphi(x)w + |w|\psi(w)$$

where $\psi(w) \to 0$ as $w \to 0$. Evaluating at $t$, we find

$$\alpha(t, x + w) - \alpha(t, x) = (\varphi(x)w)(t) + |w|\psi(w)(t),$$

and from this we see that

$$D_2\alpha(t, x)w = (\varphi(x)w)(t).$$

Then

$$|D_2\alpha(t, x)w - D_2\alpha(s, y)w| \leq |(\varphi(x)w)(t) - (\varphi(y)w)(t)|
+ |(\varphi(y)w)(t) - (\varphi(y)w)(s)|.$$

The first term on the right is bounded by

$$|\varphi(x) - \varphi(y)||w|$$

so that

$$|D_2\alpha(t, x) - D_2\alpha(t, y)| \leq |\varphi(x) - \varphi(y)|.$$

We shall prove below that

$$|(\varphi(y)w)(t) - (\varphi(y)w)(s)|$$


is uniformly small with respect to $w$ when $s$ is close to $t$. This proves the
continuity of $D_2\alpha$, and concludes the proof that $\alpha$ is of class $C^1$.

The following proof that $|(\varphi(y)w)(t) - (\varphi(y)w)(s)|$ is uniformly small
was shown to me by Professor Yamanaka. We have

$$\alpha(t, x) = x + \int_0^t f(\alpha(u, x))\, du. \tag{1}$$

Replacing $x$ with $x + \lambda w$ ($w \in E$, $\lambda \neq 0$), we obtain

$$\alpha(t, x + \lambda w) = x + \lambda w + \int_0^t f(\alpha(u, x + \lambda w))\, du. \tag{2}$$

Therefore

$$\frac{\alpha(t, x + \lambda w) - \alpha(t, x)}{\lambda}
= w + \frac{1}{\lambda}\int_0^t [f(\alpha(u, x + \lambda w)) - f(\alpha(u, x))]\, du. \tag{3}$$

On the other hand, we have already seen in the proof of Theorem 4.3
that

$$\alpha(t, x + \lambda w) - \alpha(t, x) = \lambda(\varphi(x)w)(t) + |\lambda||w|\psi(\lambda w)(t). \tag{4}$$

Substituting (4) in (3), we obtain:

$$(\varphi(x)w)(t) + \frac{|\lambda|}{\lambda}|w|\psi(\lambda w)(t)
= w + \frac{1}{\lambda}\int_0^t [f(\alpha(u, x + \lambda w)) - f(\alpha(u, x))]\, du
= w + \int_0^t \int_0^1 G(u, \lambda, v)\, dv\, du,$$

where

$$G(u, \lambda, v) = Df(\alpha(u, x) + v\epsilon_1(\lambda))\bigl((\varphi(x)w)(u) + \epsilon_2(\lambda)\bigr)$$

with

$$\epsilon_1(\lambda) = \lambda(\varphi(x)w)(u) + |\lambda||w|\psi(\lambda w)(u), \qquad
\epsilon_2(\lambda) = \frac{|\lambda|}{\lambda}|w|\psi(\lambda w)(u).$$

Letting $\lambda \to 0$, we have

$$(\varphi(x)w)(t) = w + \int_0^t Df(\alpha(u, x))(\varphi(x)w)(u)\, du. \tag{5}$$

By (5) we have

$$|(\varphi(x)w)(t) - (\varphi(x)w)(s)| \leq \left| \int_s^t Df(\alpha(u, x))(\varphi(x)w)(u)\, du \right|
\leq C_1|\varphi(x)||w||t - s|,$$

from which we immediately obtain the desired uniformity.


We have

$$\alpha(t, x) = x + \int_0^t f(\alpha(u, x))\, du.$$

We can differentiate under the integral sign with respect to the parameter
$x$ and thus obtain

$$D_2\alpha(t, x) = I + \int_0^t Df(\alpha(u, x))D_2\alpha(u, x)\, du,$$

where $I$ is a constant linear map (the identity). Differentiating with respect
to $t$ yields the linear differential equation satisfied by $D_2\alpha$, namely

$$D_1 D_2\alpha(t, x) = Df(\alpha(t, x))D_2\alpha(t, x)$$

and this differential equation depends on time and parameters. We have
seen in §3 how such equations can be reduced to the ordinary case. We
now conclude that locally, by induction, $D_2\alpha$ is of class $C^{p-1}$ since $Df$ is
of class $C^{p-1}$. Since

$$D_1\alpha(t, x) = f(\alpha(t, x)),$$

we conclude by induction that $D_1\alpha$ is $C^{p-1}$. Hence $\alpha$ is of class $C^p$ by
Theorem 7.1 of Chapter XIII. Note that each time we use induction, the
domain of the flow may shrink. In the next section, we shall prove a
more global result. In any case, we have proved Theorem 4.3.
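
The linear differential equation satisfied by $D_2\alpha$ can be integrated numerically together with the flow itself. The sketch below is only an illustration in the scalar case $E = \mathbf{R}$: the function name `flow_with_derivative`, the Runge-Kutta discretization, and the example $f(x) = x^2$ are assumptions. For that example the flow is $\alpha(t, x) = x/(1 - tx)$, so $D_2\alpha(t, x) = 1/(1 - tx)^2$, which gives an exact value to compare with.

```python
def flow_with_derivative(f, df, x, t, n_steps=2000):
    """Integrate alpha' = f(alpha), J' = df(alpha) * J with alpha(0) = x, J(0) = 1 (RK4)."""
    def field(a, j):
        return f(a), df(a) * j
    h = t / n_steps
    a, j = x, 1.0
    for _ in range(n_steps):
        k1 = field(a, j)
        k2 = field(a + 0.5 * h * k1[0], j + 0.5 * h * k1[1])
        k3 = field(a + 0.5 * h * k2[0], j + 0.5 * h * k2[1])
        k4 = field(a + h * k3[0], j + h * k3[1])
        a += h * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6
        j += h * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6
    return a, j

# Hypothetical example: f(x) = x**2, with exact flow alpha(t, x) = x / (1 - t x)
# and exact derivative D_2 alpha(t, x) = 1 / (1 - t x)**2.
alpha, d2_alpha = flow_with_derivative(lambda x: x * x, lambda x: 2 * x, 0.5, 1.0)
```

At $t = 1$, $x = 1/2$ the exact values are $\alpha = 1$ and $D_2\alpha = 4$.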


XIV, §5. GLOBAL SMOOTHNESS OF THE FLOW 


Let $U$ be open in a Banach space $E$, and let $f: U \to E$ be a $C^p$ vector
field. We let $J(x)$ be the domain of the integral curve with initial condition
equal to $x$.

Let $D(f)$ be the set of all points $(t, x)$ in $\mathbf{R} \times U$ such that $t$ lies in
$J(x)$. Then we have a map

$$\alpha: D(f) \to U$$

defined on all of $D(f)$, letting $\alpha(t, x) = \alpha_x(t)$ be the integral curve on $J(x)$
having $x$ as initial condition. We call this the flow determined by $f$, and
we call $D(f)$ its domain of definition.


Theorem 5.1. Let $f: U \to E$ be a $C^p$ vector field on the open set $U$ of
$E$, and let $\alpha$ be its flow. Abbreviate $\alpha(t, x)$ by $tx$ if $(t, x)$ is in the
domain of definition of the flow. Let $x \in U$. If $t_0$ lies in $J(x)$, then

$$J(t_0 x) = J(x) - t_0$$

(translation of $J(x)$ by $-t_0$), and we have for all $t$ in $J(x) - t_0$:

$$t(t_0 x) = (t + t_0)x.$$

Proof. The two curves defined by

$$t \mapsto \alpha(t, \alpha(t_0, x)) \quad\text{and}\quad t \mapsto \alpha(t + t_0, x)$$

are integral curves of the same vector field, with the same initial condition
$t_0 x$ at $t = 0$. Hence they have the same domain of definition $J(t_0 x)$.
Hence $t_1$ lies in $J(t_0 x)$ if and only if $t_1 + t_0$ lies in $J(x)$. This proves
the first assertion. The second assertion comes from the uniqueness of 
the integral curve having given initial condition, whence the theorem 
follows. 
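
Both assertions of the theorem can be checked on an example where the flow is known in closed form. The sketch below is an illustration under an assumption not made in the text: the vector field is $f(x) = x^2$ on $\mathbf{R}$, whose flow is $\alpha(t, x) = x/(1 - tx)$ with $J(x) = (-\infty, 1/x)$ for $x > 0$. The helper names are hypothetical.

```python
def alpha(t, x):
    """Flow of the vector field f(x) = x**2 on R: alpha(t, x) = x / (1 - t x)."""
    return x / (1.0 - t * x)

def right_end(x):
    """Right end point of J(x) for x > 0, namely 1 / x."""
    return 1.0 / x

x, t0, t = 0.5, 0.8, 0.7          # t0 lies in J(x) = (-inf, 2), and t + t0 < 2
lhs = alpha(t, alpha(t0, x))      # t(t0 x)
rhs = alpha(t + t0, x)            # (t + t0) x
```

The right end point of $J(t_0 x)$ equals that of $J(x)$ shifted by $-t_0$, and the two values `lhs` and `rhs` agree up to rounding.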


Theorem 5.2. If $f$ is of class $C^p$ (with $p \leq \infty$), then its flow is of class
$C^p$ on its domain of definition.


Proof. First let $p$ be an integer $\geq 1$. We know that the flow is locally
of class $C^p$ at each point $(0, x)$, by Theorem 4.3. Let $x_0 \in U$ and let $J(x_0)$
be the maximal interval of definition of the integral curve having $x_0$ as
initial condition. Let $D(f)$ be the domain of definition of the flow, and
let $\alpha$ be the flow. Let $Q$ be the set of numbers $b > 0$ such that for each $t$
with $0 \leq t < b$ there exists an open interval $J$ containing $t$ and an open
set $V$ containing $x_0$ such that $J \times V$ is contained in $D(f)$ and such that
$\alpha$ is of class $C^p$ on $J \times V$. Then $Q$ is not empty by Theorem 4.3. If $Q$ is
not bounded from above, then we are done looking toward the right end
point of $J(x_0)$. If $Q$ is bounded from above, we let $b$ be its least upper
bound. We must prove that $b$ is the right end point of $J(x_0)$. Suppose
that this is not the case. Then $\alpha(b, x_0)$ is defined. Let $x_1 = \alpha(b, x_0)$. By
the local Theorem 4.3, we have a unique local flow at $x_1$, which we
denote by $\beta$:

$$\beta: J_a \times B_a(x_1) \to U, \qquad \beta(0, x) = x,$$

defined for some open interval $J_a = (-a, a)$ and open ball $B_a(x_1)$ of radius
$a$ centered at $x_1$. Let $\delta$ be so small that whenever $b - \delta < t < b$ we
have

$$\alpha(t, x_0) \in B_{a/4}(x_1).$$

We can find such $\delta$ because

$$\lim_{t \to b} \alpha(t, x_0) = x_1$$

by continuity. Select a point $t_1$ such that $b - \delta < t_1 < b$. By the hypothesis
on $b$, we can select an open interval $J_1$ containing $t_1$ and an open
set $U_1$ containing $x_0$ so that

$$\alpha: J_1 \times U_1 \to B_{a/2}(x_1)$$

maps $J_1 \times U_1$ into $B_{a/2}(x_1)$. We can do this because $\alpha$ is continuous at
$(t_1, x_0)$, being in fact $C^p$ at this point. If $|t - t_1| < a$ and $x \in U_1$, we
define

$$\varphi(t, x) = \beta(t - t_1, \alpha(t_1, x)).$$

Then

$$\varphi(t_1, x) = \beta(0, \alpha(t_1, x)) = \alpha(t_1, x)$$

and

$$D_1\varphi(t, x) = D_1\beta(t - t_1, \alpha(t_1, x))
= f(\beta(t - t_1, \alpha(t_1, x)))
= f(\varphi(t, x)).$$

Hence both $\varphi_x$ and $\alpha_x$ are integral curves for $f$ with the same value at $t_1$.
They coincide on any interval on which they are defined by Theorem 3.3.
If we take $\delta$ very small compared to $a$, say $\delta < a/4$, we see that $\varphi$ is an
extension of $\alpha$ to an open set containing $(t_1, x_0)$, and also containing
$(b, x_0)$. Furthermore, $\varphi$ is of class $C^p$, thus contradicting the fact that $b$
is strictly smaller than the end point of $J(x_0)$. Similarly, one proves the
analogous statement on the other side, and we therefore see that $D(f)$ is
open in $\mathbf{R} \times U$ and that $\alpha$ is of class $C^p$ on $D(f)$, as was to be shown.


The idea of the above proof is very simple geometrically. We go as
far to the right as possible in such a way that the given flow $\alpha$ is of class
$C^p$ locally at $(t, x_0)$. At the point $\alpha(b, x_0)$ we then use the flow $\beta$ to
extend differentiably the flow $\alpha$ in case $b$ is not the right-hand point of
$J(x_0)$. The flow $\beta$ at $\alpha(b, x_0)$ has a fixed local domain of definition, and
we simply take $t$ close enough to $b$ so that $\beta$ gives an extension of $\alpha$, as
described in the above proof.

Of course, if $f$ is of class $C^\infty$, then we have shown that $\alpha$ is of class $C^p$
for each positive integer $p$, and therefore the flow is also of class $C^\infty$.


XIV, §6. EXERCISES 


1. (Tate) Let $E$, $F$ be complete normed vector spaces. Let $f: E \to F$ be a map
having the following property. There exists a number $C > 0$ such that for all
$x, y \in E$ we have

$$|f(x + y) - f(x) - f(y)| \leq C.$$

Show that there exists a unique linear map $g: E \to F$ such that $g - f$ is
bounded for the sup norm. [Hint: Show that the limit

$$g(x) = \lim_{n \to \infty} \frac{f(2^n x)}{2^n}$$

exists.]


2. Generalize Exercise 1 to the bilinear case. In other words, let $f: E \times F \to G$
be a map and assume that there is a constant $C$ such that

$$|f(x_1 + x_2, y) - f(x_1, y) - f(x_2, y)| \leq C,$$
$$|f(x, y_1 + y_2) - f(x, y_1) - f(x, y_2)| \leq C$$

for all $x, x_1, x_2 \in E$ and $y, y_1, y_2 \in F$. Show that there exists a unique
bilinear map $g: E \times F \to G$ such that $f - g$ is bounded for the sup norm.


3. Prove the following statement. Let $\bar B_r$ be the closed ball of radius $r$ centered
at 0 in $E$. Let $f: \bar B_r \to E$ be a map such that:
(a) $|f(x) - f(y)| \leq b|x - y|$ with $0 < b < 1$.
(b) $|f(0)| \leq r(1 - b)$.

Show that there exists a unique point $x \in \bar B_r$ such that $f(x) = x$.


4. With notation as in Exercise 3, let $g$ be another map of $\bar B_r$ into $E$ and let
$c > 0$ be such that $|g(x) - f(x)| \leq c$ for all $x$. Assume that $g$ has a fixed point
$x_2$, and let $x_1$ be the fixed point of $f$. Show that $|x_2 - x_1| \leq c/(1 - b)$.


5. Let $K$ be a continuous function of two variables, defined for $(x, y)$ in the
square $a \leq x \leq b$ and $a \leq y \leq b$. Assume that $\|K\| \leq C$ for some constant
$C > 0$. Let $f$ be a continuous function on $[a, b]$ and let $r$ be a real number
satisfying the inequality

$$|r| < \frac{1}{C(b - a)}.$$

Show that there is one and only one function $g$ continuous on $[a, b]$ such
that

$$f(x) = g(x) + r\int_a^b K(t, x)g(t)\, dt.$$


6. Newton’s method. This method serves the same purpose as the shrinking 
lemma but sometimes is more efficient and converges more rapidly. It is used 
to find zeros of mappings. 

Let $B_r$ be a ball of radius $r$ centered at a point $x_0 \in E$. Let $f: B_r \to E$ be a
$C^2$ mapping, and assume that $f''$ is bounded by some number $C \geq 1$ on $B_r$.
Assume that $f'(x)$ is invertible for all $x \in B_r$ and that $|f'(x)^{-1}| \leq C$ for all
$x \in B_r$. Show that there exists a number $\delta$ depending only on $C$ such that if
$|f(x_0)| \leq \delta$ then the sequence defined by

$$x_{n+1} = x_n - f'(x_n)^{-1} f(x_n)$$

lies in $B_r$ and converges to an element $x$ such that $f(x) = 0$. Hint: Show
inductively that

$$|x_{n+1} - x_n| \leq C|f(x_n)|,$$

$$|f(x_{n+1})| \leq |x_{n+1} - x_n|^2 C,$$

and hence that

$$|f(x_n)| \leq C^{3(1+2+4+\cdots+2^n)} \delta^{2^n},$$

$$|x_{n+1} - x_n| \leq C \cdot C^{3(1+2+4+\cdots+2^n)} \delta^{2^n}.$$

7. Apply Newton's method to prove the following statement. Assume that
$f: U \to E$ is of class $C^2$ and that for some point $x_0 \in U$ we have $f(x_0) = 0$
and $f'(x_0)$ is invertible. Show that given $y$ sufficiently close to 0, there exists
$x$ close to $x_0$ such that $f(x) = y$. [Hint: Consider the map $g(x) = f(x) - y$.]

Note. The point of the Newton method is that it often gives a procedure
which converges much faster than the procedure of the shrinking lemma.
Indeed, the shrinking lemma converges more or less like a geometric series.
The Newton method converges with an exponent of $2^n$. For an interesting
application of the Newton method, see the Nash-Moser implicit mapping
theorem [Nas], [Mo 2], [Ha]. See also the partial axiomatization which I
gave in [La 4]. These show that the calculus in Banach spaces is insufficient
and leads to calculus in Fréchet spaces, where the inverse mapping theorem
and existence theorem for differential equations are much more subtle.
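
The difference in speed mentioned in the Note is easy to observe. The following sketch is an illustration only, in the scalar case $E = \mathbf{R}$: the function name `newton`, the stopping rule, and the example $f(x) = x^2 - 2$ are assumptions made for the example.

```python
import math

def newton(f, df, x, tol=1e-12, max_iter=50):
    """Iterate x_{n+1} = x_n - f'(x_n)^{-1} f(x_n) until |f(x_n)| < tol."""
    residuals = [abs(f(x))]
    while residuals[-1] >= tol:
        if len(residuals) > max_iter:
            raise RuntimeError("Newton iteration did not converge")
        x = x - f(x) / df(x)
        residuals.append(abs(f(x)))
    return x, residuals

# Hypothetical example: the positive zero of f(x) = x**2 - 2, starting at 1.5.
root, residuals = newton(lambda x: x * x - 2.0, lambda x: 2.0 * x, 1.5)
```

The list `residuals` records $|f(x_n)|$, and each entry is roughly the square of the previous one, in line with the exponent $2^n$ in the hint to Exercise 6.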


8. The following is a reformulation due to Tate of a theorem of Michael Shub. 
(a) Let $n$ be a positive integer, and let $f: \mathbf{R} \to \mathbf{R}$ be a differentiable function
such that $f'(x) \geq r > 0$ for all $x$. Assume that $f(x + 1) = f(x) + n$. Show
that there exists a strictly increasing continuous map $\alpha: \mathbf{R} \to \mathbf{R}$ satisfying

$$\alpha(x + 1) = \alpha(x) + 1$$

such that

$$f(\alpha(x)) = \alpha(nx).$$

[Hint: Follow Tate's proof. Show that $f$ is continuous, strictly increasing,
and let $g$ be its inverse function. You want to solve $\alpha(x) = g(\alpha(nx))$. Let
$M$ be the set of all continuous functions which are increasing (not necessarily
strictly) and satisfying $\alpha(x + 1) = \alpha(x) + 1$. On $M$, define the norm

$$\|\alpha\| = \sup_{0 \leq x \leq 1} |\alpha(x)|.$$

Let $T: M \to M$ be the map such that

$$(T\alpha)(x) = g(\alpha(nx)).$$

Show that $T$ maps $M$ into $M$ and is a shrinking map. Show that $M$ is
complete, and that a fixed point for $T$ solves the problem.] Since one can
write

$$nx = \alpha^{-1}(f(\alpha(x))),$$

one says that the map $x \mapsto nx$ is conjugate to $f$. Interpreting this on the
circle, one gets the statement originally due to Shub that a differentiable
function on the circle, with positive derivative, is conjugate to the $n$-th
power for some $n$.

(b) Show that the differentiability condition can be replaced by the weaker
condition: There exist numbers $r_1$, $r_2$ with $1 < r_1 < r_2$ such that for all
$x$ and all $s \ge 0$ we have

\[ r_1 s \le f(x + s) - f(x) \le r_2 s. \]


9. Let $M$ be a complete metric space (or a closed subset of a complete normed
vector space if you wish), and let $S$ be a topological space. Let $T\colon S \times M \to M$
be a continuous map, such that for each $u \in S$ the map $T_u\colon M \to M$ given
by $T_u(x) = T(u, x)$ is a shrinking map with constant $K_u$, $0 < K_u < 1$. Assume
that there is some $K$ with $0 < K < 1$ such that $K_u \le K$ for all $u \in S$. Let
$\varphi\colon S \to M$ be the map such that $\varphi(u)$ is the fixed point of $T_u$. Show that
$\varphi$ is continuous.
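
To get a feeling for why the fixed point depends continuously on the parameter, one can experiment numerically. The sketch below assumes a hypothetical family $T_u(x) = K\cos x + u$ on the real line with $K = 1/2$ (our choice, not taken from the text). For such a family the standard estimate gives $|\varphi(u) - \varphi(v)| \le \sup_x |T_u(x) - T_v(x)|/(1 - K)$, and the code checks this for one pair of parameters.

```python
import math

K = 0.5  # uniform shrinking constant of the illustrative family below

def T(u, x):
    """Hypothetical family T_u(x) = K*cos(x) + u; each T_u is K-Lipschitz in x."""
    return K * math.cos(x) + u

def fixed_point(u, x0=0.0, tol=1e-14, max_iter=10_000):
    """Iterate x -> T(u, x) until successive iterates agree to within tol."""
    x = x0
    for _ in range(max_iter):
        x_next = T(u, x)
        if abs(x_next - x) < tol:
            return x_next
        x = x_next
    raise RuntimeError("iteration did not converge")

def phi(u):
    """The fixed point of T_u, as a function of the parameter u."""
    return fixed_point(u)

u, v = 0.3, 0.3001
gap = abs(phi(u) - phi(v))
bound = abs(u - v) / (1.0 - K)   # sup_x |T_u(x) - T_v(x)| = |u - v| for this family
print(gap, bound)
```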


10. Exercises 10 and 11 develop a special case of a theorem of Anosov, by a
proof due to Moser.

First we make some definitions. Let $A\colon \mathbf{R}^2 \to \mathbf{R}^2$ be a linear map. We say
that $A$ is hyperbolic if there exist numbers $b > 1$, $c < 1$, and two linearly
independent vectors $v$, $w$ in $\mathbf{R}^2$ such that $Av = bv$ and $Aw = cw$. As an
example, show that the matrix (linear map)


2 1 
A= 
(2) 
has this property. 
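
A check of hyperbolicity amounts to computing eigenvalues and eigenvectors. The sketch below does this for the illustrative matrix with rows (2, 1) and (1, 1); this matrix is an assumption made for the sake of the example and is not claimed to be the one displayed above. The helper names are ours.

```python
import math

def eigen_2x2(a, b, c, d):
    """Real eigenpairs of [[a, b], [c, d]], assuming b != 0 and a positive discriminant."""
    tr, det = a + d, a * d - b * c
    root = math.sqrt(tr * tr - 4.0 * det)
    pairs = []
    for lam in ((tr + root) / 2.0, (tr - root) / 2.0):
        pairs.append((lam, (b, lam - a)))   # (A - lam I)(b, lam - a) = 0 because b != 0
    return pairs

def apply(a, b, c, d, vec):
    """Apply the matrix [[a, b], [c, d]] to a vector."""
    return (a * vec[0] + b * vec[1], c * vec[0] + d * vec[1])

M = (2.0, 1.0, 1.0, 1.0)   # illustrative matrix [[2, 1], [1, 1]], an assumption for this sketch
(big, v), (small, w) = eigen_2x2(*M)
print(big, small)          # one eigenvalue above 1, the other below 1
print(apply(*M, v), tuple(big * t for t in v))
```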


Next we introduce the $C^1$ norm. If $f$ is a $C^1$ map, such that both $f$ and
$f'$ are bounded, we define the $C^1$ norm to be

\[ \|f\|_1 = \max(\|f\|, \|f'\|), \]

where $\|\ \|$ is the usual sup norm. In this case, we also say that $f$ is
$C^1$-bounded.
The theorem we are after runs as follows:

Theorem. Let $A\colon \mathbf{R}^2 \to \mathbf{R}^2$ be a hyperbolic linear map. There exists $\delta$ having
the following property. If $f\colon \mathbf{R}^2 \to \mathbf{R}^2$ is a $C^1$ map such that

\[ \|f - A\|_1 < \delta, \]

then there exists a continuous bounded map $h\colon \mathbf{R}^2 \to \mathbf{R}^2$ satisfying the
equation

\[ f \circ h = h \circ A. \]

First prove a lemma.


Lemma. Let $M$ be the vector space of continuous bounded maps of $\mathbf{R}^2$ into
$\mathbf{R}^2$. Let $T\colon M \to M$ be the map defined by $Tp = p - A^{-1} \circ p \circ A$. Then $T$ is
a continuous linear map, and is invertible.

To prove the lemma, write

\[ p(x) = p^+(x)v + p^-(x)w \]

where $p^+$ and $p^-$ are functions, and note that symbolically,

\[ Tp^+ = p^+ - b^{-1} p^+ \circ A, \]

that is $Tp^+ = (I - S)p^+$ where $\|S\| < 1$. So find an inverse for $T$ on $p^+$.
Analogously, show that $Tp^- = (I - S_0^{-1})p^-$ where $\|S_0\| < 1$, so that $S_0 T =
S_0 - I$ is invertible on $p^-$. Hence $T$ can be inverted componentwise, as it
were.

To prove the theorem, write $f = A + g$ where $g$ is $C^1$-small. We want to
solve for $h = I + p$ with $p \in M$, satisfying $f \circ h = h \circ A$. Show that this is
equivalent to solving

\[ Tp = -A^{-1} \circ g \circ h, \]

or equivalently,

\[ p = -T^{-1}(A^{-1} \circ g \circ (I + p)). \]

This is then a fixed point condition for the map $R\colon M \to M$ given by

\[ R(p) = -T^{-1}(A^{-1} \circ g \circ (I + p)). \]

Show that $R$ is a shrinking map to conclude the proof.


11. One can formulate a variant of the preceding exercise (actually the very case
dealt with by Anosov–Moser). Assume that the matrix $A$ with respect to the
standard basis of $\mathbf{R}^2$ has integer coefficients. A vector $z \in \mathbf{R}^2$ is called an
integral vector if its coordinates are integers. A map $p\colon \mathbf{R}^2 \to \mathbf{R}^2$ is said to be
periodic if $p(x + z) = p(x)$ for all $x \in \mathbf{R}^2$ and all integral vectors $z$. Prove:

Theorem. Let $A$ be hyperbolic, with integer coefficients. There exists $\delta$
having the following property. If $g$ is a $C^1$, periodic map, and $\|g\|_1 < \delta$, and
if $f = A + g$, then there exists a periodic continuous map $h$ satisfying the
equation

\[ f \circ h = h \circ A. \]

Note. With only a bounded amount of extra work, one can show that the
map $h$ itself is $C^0$-invertible, and so $f = h \circ A \circ h^{-1}$.


12. (a) Let $f$ be a $C^1$ vector field on an open set $U$ in $E$. If $f(x_0) = 0$ for some
$x_0 \in U$, if $\alpha\colon J \to U$ is an integral curve for $f$, and there exists some $t_0 \in J$
such that $\alpha(t_0) = x_0$, show that $\alpha(t) = x_0$ for all $t \in J$. (A point $x_0$ such
that $f(x_0) = 0$ is called a critical point of the vector field.)

(b) Let $f$ be a $C^1$ vector field on an open set $U$ of $E$. Let $\alpha\colon J \to U$ be an
integral curve for $f$. Assume that all numbers $t > 0$ are contained in $J$,
and that there is a point $P$ in $U$ such that

\[ \lim_{t \to \infty} \alpha(t) = P. \]

Prove that $f(P) = 0$. (Exercises 12(a) and 12(b) have many applications,
notably when $f = \operatorname{grad} g$ for some function $g$. In this case we see that $P$
is a critical point of the function $g$.)


13. Let $U$ be open in the (real) Hilbert space $E$ and let $g\colon U \to \mathbf{R}$ be a $C^2$
function. Then $g'\colon U \to L(E, \mathbf{R})$ is a $C^1$ map into the dual space, and we
know that $E$ is self dual. Thus there is a $C^1$ map $f\colon U \to E$ such that

\[ g'(x)v = \langle v, f(x) \rangle \]

for all $x \in U$ and $v \in E$. We call $f$ the gradient of $g$.

Let $g\colon U \to \mathbf{R}$ be a function of class $C^2$. Let $x_0 \in U$ and assume that $x_0$ is
a critical point of $g$ (that is $g'(x_0) = 0$). Assume also that $D^2 g(x_0)$ is negative
definite. By definition, take this to mean that there exists a number $c > 0$
such that for all vectors $v$ we have

\[ D^2 g(x_0)(v, v) \le -c|v|^2. \]

Prove that if $x_1$ is a point in the ball $B_r(x_0)$ of radius $r$, centered at $x_0$, and if
$r$ is sufficiently small, then the integral curve $\alpha$ of $\operatorname{grad} g$ having $x_1$ as initial
condition is defined for all $t \ge 0$ and

\[ \lim_{t \to \infty} \alpha(t) = x_0. \]

Hint: Let $\psi(t) = (\alpha(t) - x_0) \cdot (\alpha(t) - x_0)$ be the square of the distance from
$\alpha(t)$ to $x_0$. Show that $\psi$ is strictly decreasing, and in fact satisfies

\[ \psi'(t) \le -c\,\psi(t). \]

Divide by $\psi(t)$ and integrate to see that

\[ \log \psi(t) - \log \psi(0) \le -ct. \]
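
The decay estimate in the hint can be observed on a simple example. The sketch below assumes $E = \mathbf{R}^2$, the illustrative function $g(x, y) = -(x^2 + 2y^2)$ with critical point $x_0 = 0$ and constant $c = 2$, and replaces the integral curve by an explicit Euler approximation with a small step. All of these choices are ours and only serve to illustrate the inequality $\log \psi(t) - \log \psi(0) \le -ct$.

```python
import math

C = 2.0  # D^2 g(0)(v, v) = -2 x^2 - 4 y^2 <= -C |v|^2 for the example below

def grad_g(p):
    """Gradient of the illustrative function g(x, y) = -(x^2 + 2 y^2)."""
    x, y = p
    return (-2.0 * x, -4.0 * y)

def flow(p0, h, steps):
    """Explicit Euler approximation of the integral curve of grad g starting at p0."""
    p, path = p0, [p0]
    for _ in range(steps):
        gx, gy = grad_g(p)
        p = (p[0] + h * gx, p[1] + h * gy)
        path.append(p)
    return path

def psi(p):
    """Square of the distance from p to the critical point 0."""
    return p[0] ** 2 + p[1] ** 2

h, steps = 0.01, 500
path = flow((0.3, -0.2), h, steps)
for n in (0, 100, 500):
    print(n * h, psi(path[n]))   # psi decreases at least like exp(-C t)
```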
14. Let $U$ be open in $E$ and let $f\colon U \to E$ be a $C^1$ vector field on $U$. Let $x_0 \in U$
and assume that $f(x_0) = v \ne 0$. Let $\alpha$ be a local flow for $f$ at $x_0$. Let $F$ be a
subspace of $E$ which is complementary to the one-dimensional space generated
by $v$, that is the map

\[ \mathbf{R} \times F \to E \]

given by $(t, y) \mapsto tv + y$ is an invertible continuous linear map.

(a) If $E = \mathbf{R}^n$ show that such a subspace exists. (The general case can be
proved by the Hahn–Banach theorem.)

(b) Show that the map $\beta\colon (t, y) \mapsto \alpha(t, x_0 + y)$ is a local $C^1$ isomorphism at
$(0, 0)$. Compute $D\beta$ in terms of $D_1\alpha$ and $D_2\alpha$.

(c) The map $\sigma\colon (t, y) \mapsto x_0 + y + tv$ is obviously a $C^1$ isomorphism, because it
is composed of a translation and an invertible linear map. Define locally
at $x_0$ the map $\varphi$ by $\varphi = \beta \circ \sigma^{-1}$, so that by definition,

\[ \varphi(x_0 + y + tv) = \alpha(t, x_0 + y). \]

Using the chain rule, show that for all $x$ near $x_0$ we have

\[ D\varphi(x)v = f(\varphi(x)). \]

In the language of charts (Chapter XXI) this expresses the fact that if a
vector field is not zero at a point, then after a change of charts, this
vector field can be made to be constant in a neighborhood of that point.


15. Let $J$ be an open interval $(a, b)$ and let $U$ be open in $E$. Let $f\colon J \times U \to E$
be a continuous map which is Lipschitz on $U$ uniformly for every compact
subinterval of $J$. Let $\alpha$ be an integral curve of $f$, defined on a maximal open
subinterval $(a_0, b_0)$ of $J$. Assume:

(a) There exists $\varepsilon > 0$ such that the closure of $\alpha((b_0 - \varepsilon, b_0))$ is contained in $U$.
(b) There exists $C > 0$ such that $|f(t, \alpha(t))| \le C$ for all $t$ in $(b_0 - \varepsilon, b_0)$.

Then $b_0 = b$.


16. Linear differential equations. Let $J$ be an open interval containing $0$, and let
$V$ be open in a Banach space $E$. Let $L$ be a Banach space. Let $A\colon J \times V \to L$
be a continuous map, and let $L \times E \to E$ be a continuous bilinear map.
Let $w_0 \in E$. Then there exists a unique map $\lambda\colon J \times V \to E$, which for each
$x \in V$ is a solution of the differential equation

\[ D_1 \lambda(t, x) = A(t, x)\lambda(t, x), \qquad \lambda(0, x) = w_0. \]

This map $\lambda$ is continuous. [Hint: Use Exercise 15. We see that in the linear
case, the integral curve is defined over the whole interval $J$.]


17. Let $U$ be open in a Banach space $E$ and let $f\colon U \to E$ be a $C^1$ vector field.
Assume that $f$ is bounded. Let $\alpha$ be an integral curve for $f$, and let $J$ be its
maximal interval of definition. Suppose that $J$ does not contain all positive
real numbers, and let $b$ be its right end point. Show that

\[ \lim_{t \to b} \alpha(t) \]

exists, and that it is a boundary point of $U$. Cf. [La 1] and [La 2] to see
Exercises 12-17 worked out.


PART FIVE 


Functional Analysis 


In this part, we present some basic and substantial results of functional 
analysis, which are extremely widely used. The part splits into essentially 
independent considerations dealing with Banach spaces in general, and 
then with the special case of hermitian operators in Hilbert space. 

First we have a chapter on some general theorems which extend to 
Banach spaces some basic algebraic results on finite dimensional spaces, 
taking into account the linear topology. The algebraic theorems concern- 
ing the existence of complementary subspaces, or criteria for a linear map 
to be surjective, need a more systematic study in light of the additional 
structure provided by the norms and the continuity of linear maps. 

The rest of the part handles systematically the spectrum of an opera- 
tor in various contexts. First we deal with the spectrum in a fairly 
general context of Banach algebras. Then we study spectral decomposi- 
tions for specific types of operators, starting with compact operators 
which are closest to those in finite dimensional spaces. Then we go into 
the systematic study of hermitian operators in Hilbert space. Note that 
the proof of the spectral theorem in this context is self-contained, inde- 
pendent of the chapter on Banach spaces. Knowing just the self duality 
of Hilbert space and the existence of orthogonal complements for closed 
subspaces constitutes a sufficient tool for the applications to the rest of 
the book. The spectral theorems are included so that readers can push 
forward in these particular directions if they are so inclined by taste 
rather than by formal requirements for the present basic course. 

The functional analysis is principally concerned with the study of a 
space with an operator, giving as simple a description as possible for the 
way in which this operator operates. The two spectral theorems give 
examples of the standard manner in which such a description can be 
made, i.e. either by describing a basis for the space on which the effect
of the operator is obvious, or by giving a structure theorem for the 
algebra generated by the operator. These two ways permeate functional 
analysis. 


CHAPTER XV 


The Open Mapping Theorem, 
Factor Spaces, and Duality 


XV, §1. THE OPEN MAPPING THEOREM 
We begin with a general theorem on metric spaces. 


Theorem 1.1 (Baire's Theorem). Let $X$ be a complete metric space, and
assume that $X$ is the union of a sequence of closed subsets $S_n$. Then
some $S_n$ contains a non-empty open ball.


Proof. Suppose that this is not the case. We find $x_1$ in the complement
of $S_1$ (which cannot be the whole space) and some closed ball
$\bar B_{r_1}(x_1)$ centered at $x_1$ of radius $r_1 > 0$, contained in this complement. By
assumption, there is some $x_2$ in $B_{r_1}(x_1)$ contained in the complement of
$S_2$ and some closed ball $\bar B_{r_2}(x_2)$ contained in $B_{r_1}(x_1)$, and which lies in
the complement of $S_2$. We continue inductively using a sequence $r_1, r_2, \ldots$
such that $r_n > 0$ and $r_n \to 0$. We thus obtain a sequence of closed
balls

\[ \bar B_{r_1}(x_1) \supset \bar B_{r_2}(x_2) \supset \cdots \supset \bar B_{r_n}(x_n) \]

such that $\bar B_{r_n}(x_n)$ is disjoint from $S_1 \cup \cdots \cup S_n$. We then select $x_{n+1}$ and
$\bar B_{r_{n+1}}(x_{n+1}) \subset B_{r_n}(x_n)$ disjoint from $S_{n+1}$. Then the sequence $\{x_n\}$ is a
Cauchy sequence, converging to a point $x$, and $x$ lies in $\bar B_{r_n}(x_n)$ for
all $n$. Hence $x$ does not lie in $S_n$ for any $n$, contradicting the hypothesis
that the union of all $S_n$ is equal to $X$. This proves Baire's theorem.


Corollary 1.2. Let $X$ be a complete metric space, and $\{U_n\}$ a sequence
of open dense sets. Then the intersection $\bigcap U_n$ is not empty.


Proof. Take the complement of the sets in Baire's theorem.
Theorem 1.3 (Open Mapping Theorem). Let $E$, $F$ be Banach spaces,
and let $\varphi\colon E \to F$ be a continuous linear map, which is surjective. Then
$\varphi$ is open.


Proof. For $s > 0$ we denote by $B_s$ the open ball of radius $s$ in $E$,
centered at the origin, and by $C_s$ the open ball in $F$ centered at the
origin. Let $S_n = \overline{\varphi(B_n)}$. Then $S_n$ is closed, and the union of all sets $S_n$ is
equal to $F$. By Baire's theorem, some $\varphi(B_n)$ contains a set which is dense
in some non-empty open ball $V$ in $F$, centered at a point $y$. If $y = \varphi(x)$,
for some $x \in E$, then translating by $y$, we conclude that there is some
$k \ge 1$ and $r > 0$ such that $\varphi(B_{kr})$ contains a set which is dense in $C_r$. By
homogeneity (i.e. the fact that $B_{st} = tB_s$ for $s, t > 0$), it follows that this
last statement holds if we replace $r$ by any number $s > 0$. We shall prove
that in fact, $\varphi(B_{kr})$ contains $C_r$. Select $0 < \delta < 1$, and let $y \in F$, $|y| < r$.
There exists $x_1 \in E$ with $|x_1| < kr$ such that

\[ |y - \varphi(x_1)| < \delta r. \]

Inductively, there exist $x_1, \ldots, x_n \in E$ such that $|x_n| < k\delta^{n-1} r$ and

\[ |y - \varphi(x_1) - \cdots - \varphi(x_n)| < \delta^n r. \]

Then the sum $x_1 + \cdots + x_n$ converges to an element $x$ such that $y = \varphi(x)$,
and furthermore,

\[ |x| < kr/(1 - \delta). \]

Hence $\varphi(B_{kr})$ contains the ball $C_{r(1-\delta)}$ of radius $r(1 - \delta)$. This is true
for every $\delta > 0$ whence our assertion follows that $\varphi(B_{kr})$ contains $C_r$.

Now to conclude the proof of Theorem 1.3, let $U$ be an open set in $E$,
and let $x \in U$. Let $B$ be an open ball centered at the origin in $E$
such that $x + B \subset U$. Then $\varphi(x) + \varphi(B)$ is contained in $\varphi(U)$. But $\varphi(B)$
contains an open ball centered at the origin in $F$. This proves that $\varphi(U)$
is open, and concludes the proof of the open mapping theorem.
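
The bound on $|x|$ in the proof comes from a geometric series. As a short worked step, using only the estimates $|x_n| < k\delta^{n-1} r$ and the completeness of $E$ (the summation index is our notation):

```latex
\[
|x| \le \sum_{n=1}^{\infty} |x_n|
    < \sum_{n=1}^{\infty} k\,\delta^{\,n-1} r
    = \frac{kr}{1-\delta},
\qquad
\Bigl|\, y - \varphi\Bigl(\sum_{i=1}^{n} x_i\Bigr) \Bigr| < \delta^{n} r \longrightarrow 0 .
\]
```

The first chain shows that the partial sums form a Cauchy sequence, and the second, together with the continuity of $\varphi$, gives $y = \varphi(x)$.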


Corollary 1.4. Let $\varphi\colon E \to F$ be a continuous linear map, which is
bijective. Then $\varphi$ is a toplinear isomorphism.


Proof. Since $\varphi$ is open by Theorem 1.3, the inverse of $\varphi$ is also
continuous, so we are done.


Corollary 1.5. Let $F$, $G$ be closed subspaces of $E$ such that $F + G = E$
and $F \cap G = \{0\}$. Then the map

\[ F \times G \to E \]

such that $(x, y) \mapsto x + y$ is a toplinear isomorphism.

Proof. It is continuous and bijective, so that Corollary 1.4 applies.


In Corollary 1.5, we shall also say that $E$ is the direct sum of $F$, $G$
and we write

\[ E = F \oplus G. \]

We say that $F$, $G$ are complementary subspaces.
Let $E$ be a Banach space and $F$ a closed subspace. Then we can
define a norm on the factor space $E/F$ by

\[ |x + F| = \inf_{y \in F} |x + y|. \]

Then $E/F$ is complete under this norm, i.e. is also a Banach space. To see
this, let

\[ \varphi\colon E \to E/F \]

be the canonical map which to each $x \in E$ associates the coset $\varphi(x) =
x + F$. Then $\varphi$ is a continuous linear map, and

\[ |\varphi(x)| \le |x|. \]

Let $\{\xi_n\}$ be a Cauchy sequence in $E/F$. Taking a subsequence if necessary,
we may assume without loss of generality that

\[ |\xi_n - \xi_{n-1}| < \frac{1}{2^n}. \]

We find inductively elements $x_n \in E$ such that $\varphi(x_n) = \xi_n$ and such that

\[ |x_n - x_{n-1}| < \frac{1}{2^n}. \]

Indeed, suppose that we have found $x_1, \ldots, x_n$ satisfying these conditions.
Since $|\xi_{n+1} - \xi_n| < \frac{1}{2^{n+1}}$ we can find $y$ such that

\[ \varphi(y) = \xi_{n+1} - \xi_n \quad\text{and}\quad |y| < \frac{1}{2^{n+1}} \]

by the definition of the norm on $E/F$. We let $x_{n+1} = y + x_n$ to achieve
what we want. Then the sequence $\{x_n\}$ is a Cauchy sequence in $E$, and
converges to some element $x$. It follows that $\varphi(x_n) = \xi_n$ converges to
$\varphi(x)$ since $\varphi$ is continuous, as was to be shown.


Let $\varphi\colon E \to G$ be a continuous linear map where $E$, $G$ are Banach
spaces. The image $\varphi(E)$ is a subspace, which is not necessarily closed.
Let $F$ be the kernel of $\varphi$. Then we have the usual linear map

\[ E/F \to G \]

induced by $\varphi$, namely the map such that $x + F \mapsto \varphi(x) = \varphi(x + F)$. This
map is in fact continuous, because there exists $C > 0$ such that for all
$x \in E$ we have

\[ |\varphi(x)| \le C|x|. \]

Since $\varphi(x) = \varphi(x + y)$ for all $y \in F$ it follows that

\[ |\varphi(x)| \le C|x + F| \]

whence the continuity of $E/F \to G$. Consequently, by Corollary 1.4 of
the open mapping theorem, if $\varphi$ is surjective, it follows that the map
$E/F \to G$ is a toplinear isomorphism.

Let $E$ be a vector space and $F$ a subspace. If $E/F$ has finite dimension,
then we say that $F$ has finite codimension, and we call $\dim E/F$ its
codimension.


Corollary 1.6. Let E be a Banach space and F a closed subspace of 
finite dimension or finite codimension. Then F has a complementary 
closed subspace. 


Proof. Assume that $F$ is finite dimensional. The proof is then independent
of the open mapping theorem, namely we let $\{\varphi_1, \ldots, \varphi_n\}$ be a
basis of the dual space of $F$. By Hahn–Banach, we extend each $\varphi_i$ to a
functional on $E$, denoted by the same letter, and we map

\[ x \mapsto (\varphi_1 x, \ldots, \varphi_n x) \]

for $x \in E$. Let $G$ be the kernel of this map. Then $G$ is closed, and it is
immediately verified that $G$ is a complement of $F$.

Next, assume that $F$ has finite codimension, and let $\{y_1, \ldots, y_n\}$ be a
basis of $E/F$. Let $x_1, \ldots, x_n$ be elements of $E$ mapping into $y_1, \ldots, y_n$
respectively in the natural map

\[ E \to E/F. \]

Let $G$ be the space generated by $x_1, \ldots, x_n$. Then $G$ is finite dimensional,
hence closed, and $F \cap G = \{0\}$ while $F + G = E$. We can apply Corollary
1.5 to conclude the proof of this case.


Later in discussing Fredholm operators, we shall also need the following
completely elementary fact:

Proposition 1.7. If $F$ is closed in $E$, and $E/F$ is finite dimensional, and
if $G$ is a subspace of $E$ such that $F \subset G \subset E$, then $G$ is closed.


Proof. The image of $G$ in the factor space $E/F$ is in a finite dimensional
space, hence closed. Since $G$ is the inverse image in $E$ of its image
in $E/F$, it follows that $G$ is closed.


Corollary 1.8. Let $E$, $G$ be Banach spaces. Let $\varphi\colon E \to G$ be a continuous
linear map such that the image $\varphi(E)$ is finite codimensional. Then
$\varphi(E)$ is closed.


Proof. We can find in the usual way (as in Corollary 1.6) a finite
dimensional subspace $F$ of $G$ such that $G = \varphi(E) + F$. Of course, so far,
this is an algebraic direct sum, not yet topological. Factoring out the
kernel of $\varphi$, we may assume without loss of generality that $\varphi$ is injective.
We compose $\varphi$ with the natural map $\psi\colon G \to G/F$. Then the composite

\[ E \xrightarrow{\ \varphi\ } G \xrightarrow{\ \psi\ } G/F \]

is a bijective continuous linear map of $E$ on $G/F$, hence is a toplinear
isomorphism by Corollary 1.4. Hence the inverse map $(\psi \circ \varphi)^{-1}$ is continuous,
and therefore so is the map

\[ \varphi \circ (\psi \circ \varphi)^{-1}, \]

which maps $G/F$ on $\varphi(E)$. Hence $\varphi(E)$ is toplinearly isomorphic with
$G/F$. Since $G/F$ is complete, it follows that $\varphi(E)$ is complete, and consequently
$\varphi(E)$ is closed in $G$, as was to be shown.
We could now deal with either the real or complex case. We deal with
the latter, since it is useful to get used to the complex conjugation which 
occurs, and introduces only a change of notation. 

Let $E$ be a Banach space over the complex numbers. We let $E^*$ be
the space of anti-linear maps $\varphi\colon E \to \mathbf{C}$, i.e. continuous maps which are
$\mathbf{R}$-linear and satisfy

\[ \varphi(\alpha x) = \bar\alpha\,\varphi(x). \]

Elements of this space will be called anti-functionals or semi-functionals.
This space is obtained from the dual space $E'$ very simply, namely if $\varphi$ is
a functional in $E'$, then the map $\bar\varphi$ defined by

\[ \bar\varphi(x) = \overline{\varphi(x)} \]

is an anti-functional, i.e. an element of $E^*$, and conversely. We shall
apply to elements of $E^*$ certain results proved for elements of $E'$, e.g. the
Hahn–Banach theorem. Let $E$, $F$ be Banach spaces, and let


\[ u\colon E \to F \]

be a continuous linear map. Then $u$ induces a map

\[ u^*\colon F^* \to E^* \]

such that

\[ \varphi \mapsto \varphi \circ u, \]

and it is clear that $u^*$ is linear and continuous. It is convenient here to
use a notation as in Hilbert space. We define a map

\[ E \times E^* \to \mathbf{C} \]

by

\[ (x, \varphi) \mapsto \langle x, \varphi \rangle = \varphi(x). \]

This map is continuous sesquilinear, and we shall see that it behaves very
much like the scalar product of Hilbert space for the basic formalism of
duality.
First the remark that the map

\[ u \mapsto u^* \]

is anti-linear from $L(E, F)$ to $L(F^*, E^*)$. By definition, we have

\[ \langle ux, \varphi \rangle = \langle x, u^*\varphi \rangle \]

for all $x \in E$, $\varphi \in F^*$. Thus we call $u^*$ the adjoint of $u$. We note that $u^*$
is the unique element of $L(F^*, E^*)$ which satisfies this formula. We have

\[ (1) \qquad |u^*| = |u|. \]

To prove this, observe that for any $\varphi \in F^*$ we have

\[ |(u^*\varphi)(x)| = |\varphi(ux)| \le |\varphi|\,|u|\,|x| \]

so that $|u^*| \le |u|$. Conversely, for each $x \in E$, by the Hahn–Banach theorem,
there exists $\varphi \in F^*$ such that $|\varphi(ux)| = |ux|$, and $|\varphi| \le 1$. Then for
this $\varphi$, we get

\[ |ux| = |\varphi(ux)| = |(u^*\varphi)(x)| \le |u^*|\,|\varphi|\,|x| \le |u^*|\,|x|. \]

Hence $|u| \le |u^*|$, thus proving our assertion.
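
In finite dimensions the adjoint formalism can be checked by hand. The sketch below makes several assumptions of our own: $E = \mathbf{C}^3$ and $F = \mathbf{C}^2$ carry the sup norm, an anti-functional is represented by a vector $b$ through $x \mapsto \sum \bar{x}_i b_i$ (so its norm is the $\ell^1$ norm of $b$), and $u$ is given by an arbitrary illustrative matrix. With these conventions $u^*$ is the conjugate transpose, and both operator norms are given by the familiar row-sum and column-sum formulas.

```python
def apply(A, x):
    """Matrix-vector product for a matrix given as a list of rows."""
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]

def adjoint(A):
    """Conjugate transpose of A."""
    rows, cols = len(A), len(A[0])
    return [[A[i][j].conjugate() for i in range(rows)] for j in range(cols)]

def pairing(x, b):
    """<x, phi> = phi(x), where phi is the anti-functional x -> sum(conj(x_i) * b_i)."""
    return sum(xi.conjugate() * bi for xi, bi in zip(x, b))

def sup_operator_norm(A):
    """Operator norm for the sup norm on both sides: largest absolute row sum."""
    return max(sum(abs(a) for a in row) for row in A)

def l1_operator_norm(B):
    """Operator norm for the l^1 norm on both sides: largest absolute column sum."""
    return max(sum(abs(B[i][j]) for i in range(len(B))) for j in range(len(B[0])))

A = [[1 + 2j, 0.5j, -1], [2, -1j, 3 - 1j]]   # illustrative u: C^3 -> C^2
x = [1j, 2 - 1j, 0.5]
b = [1 - 1j, 2j]                               # represents an anti-functional phi on F

lhs = pairing(apply(A, x), b)            # <ux, phi>
rhs = pairing(x, apply(adjoint(A), b))   # <x, u* phi>
print(lhs, rhs)
print(sup_operator_norm(A), l1_operator_norm(adjoint(A)))   # |u| and |u*|
```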

We have the following duality between spaces, subspaces, and factor
spaces. Let $F$ be a closed subspace of $E$. We denote by $F^\perp$ the set of all
elements $\varphi \in E^*$ such that $\varphi(F) = 0$. (This is similar to the situation in
Hilbert space, but here we have the natural map

\[ E \times E^* \to \mathbf{C} \]

instead of the hermitian product of Hilbert space.) Then $F^\perp$ is clearly a
closed subspace. We have a natural continuous linear map

\[ E^* \to F^* \]

by restriction, i.e. each $\varphi \in E^*$ induces by restriction an element of $F^*$.
The kernel is precisely $F^\perp$. Furthermore, our map is surjective, because an
anti-functional $\psi$ on $F$ can be extended to an anti-functional $\varphi$ on $E$ by
the Hahn–Banach theorem. Hence we have a natural toplinear isomorphism

\[ (2) \qquad E^*/F^\perp \xrightarrow{\ \approx\ } F^*. \]

We observe that the notion of perpendicularity can be defined on the
other side as well, i.e. given a subset $S$ of $E^*$, we let $S^\perp$ be the set of
$x \in E$ such that $\langle x, \varphi \rangle = 0$ for all $\varphi \in S$. Then $S^\perp$ is a closed subspace of $E$.

We have a natural toplinear isomorphism

\[ (3) \qquad F^\perp \xrightarrow{\ \approx\ } (E/F)^*. \]

Indeed, each $\varphi \in F^\perp$ defines an element of $(E/F)^*$ since $\langle F, \varphi \rangle = 0$. It is
clear that this map induces our stated isomorphism.
Let $F$ be a subspace of $E$. Then

\[ (4) \qquad F^{\perp\perp} = \bar F. \]

Indeed, it is clear that $F \subset F^{\perp\perp}$, and $F^{\perp\perp}$ is closed so that $\bar F \subset F^{\perp\perp}$.
Conversely, suppose that $x \notin \bar F$. Then there is some anti-functional $\varphi$ on
$E/\bar F$ such that $\varphi(x) \ne 0$ by the Hahn–Banach theorem. Hence $x \notin F^{\perp\perp}$,
thus proving our assertion.

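We also have the duality associated with a continuous linear map

\[ u\colon E \to G \]

namely

\[ (5) \qquad \operatorname{Ker} u^* = (\operatorname{Im} u)^\perp, \]

and if the image of $u$ is closed, then so is the image of $u^*$ and

\[ (6) \qquad \operatorname{Im} u^* = (\operatorname{Ker} u)^\perp. \]

We leave (5) as an exercise, and prove (6). We have for $x \in E$:

\[ \langle x, u^*G^* \rangle = 0 \quad\text{if and only if}\quad \langle ux, G^* \rangle = 0. \]

Hence $\operatorname{Im} u^* \subset (\operatorname{Ker} u)^\perp$. Conversely, let $\varphi \in E^*$ and $\varphi \perp \operatorname{Ker} u$. We have
a toplinear isomorphism

\[ \sigma\colon E/\operatorname{Ker} u \to u(E) \]

by Corollary 1.4 ($\sigma$ is continuous bijective). We view $\varphi$ as an anti-functional
on $E/\operatorname{Ker} u$, and then $\varphi \circ \sigma^{-1}$ is an anti-functional on $u(E)$,
which is a closed subspace of $G$. We can extend $\varphi \circ \sigma^{-1}$ to an anti-functional
$\psi$ of $G$ by the Hahn–Banach theorem. Then it is clear that
$u^*\psi = \varphi$, whence $\varphi \in \operatorname{Im} u^*$. This proves that $\operatorname{Im} u^* = (\operatorname{Ker} u)^\perp$, and in
particular proves that $\operatorname{Im} u^*$ is closed, thus proving (6).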


In particular, if again the image of $u$ is closed, then we have toplinear
isomorphisms

\[ (7) \qquad \operatorname{Ker} u^* \approx (G/uE)^*, \]

\[ (8) \qquad (\operatorname{Ker} u)^* \approx E^*/u^*G^*, \]

in a natural way.


The reader acquainted with the language of exact sequences will see
that our results can be expressed as follows. If

\[ 0 \to F \to E \to G \to 0 \]

is an exact sequence of Banach spaces, then the adjoint sequence

\[ 0 \leftarrow F^* \leftarrow E^* \leftarrow G^* \leftarrow 0 \]

is also exact.
XV, §3. APPLICATIONS OF THE OPEN
MAPPING THEOREM 


The results of this section will not be used at all throughout the rest of 
this book, and are included only for the sake of completeness. The first 
two give criteria for a linear map to be continuous. 

As usual, if $\varphi\colon E \to F$ is a map, we define the graph of $\varphi$ to be the set
of all points $(x, \varphi(x))$ in $E \times F$. If $\varphi$ is linear, then the graph of $\varphi$ is
obviously a subspace of $E \times F$.


Theorem 3.1 (Closed Graph Theorem). Let $\varphi\colon E \to F$ be a linear map
from one Banach space into another, and assume that the graph is
closed. Then $\varphi$ is continuous.


Proof. Let $G$ be the graph of $\varphi$, so that by assumption $G$ is a closed
subspace of $E \times F$. The projection

\[ G \to E \quad\text{given by}\quad (x, \varphi(x)) \mapsto x \]

is obviously continuous and bijective. By Corollary 1.4 of the open mapping
theorem, it follows that this projection is a toplinear isomorphism,
and thus has a continuous linear inverse. If we compose this continuous
linear inverse with the projection on $F$, then we obtain $\varphi$, thus proving
that $\varphi$ is continuous, as desired.


Theorem 3.2 (Principle of Uniform Boundedness). Let $E$, $F$ be Banach
spaces, and let $\{T_i\}_{i \in I}$ be a family of continuous linear maps from $E$
into $F$. Assume that for each $x \in E$ the set $\{T_i x\}_{i \in I}$ is bounded. Let $B$
be a bounded subset of $E$. Then the set

\[ \bigcup_{i \in I} T_i(B) \]

is bounded.


Proof. For each positive integer $n$ let $C_n$ be the set of all $x \in E$ such
that $|T_i x| \le n$ for all $i \in I$. Since each $T_i$ is continuous, it follows that $C_n$
is closed. By assumption, we have

\[ E = \bigcup_{n=1}^{\infty} C_n. \]

By Baire's theorem (Theorem 1.1) it follows that some $C_m$ contains an
open ball $B_r(x_0)$ with $r > 0$. This means that if $|x| < r$, then for all $i \in I$
we have

\[ |T_i(x_0 + x)| \le m, \]

whence

\[ |T_i(x)| \le |T_i(x + x_0)| + |T_i(x_0)| \le 2m. \]

Our theorem follows by homogeneity.


Corollary 3.3. Let $\{T_n\}$ be a sequence of continuous linear maps of $E$
into $F$. Assume that for each $x \in E$ the limit

\[ Tx = \lim_{n \to \infty} T_n x \]

exists. Then $T$ is a continuous linear map of $E$ into $F$, and

\[ \lim_{x \to 0} T_n x = 0 \]

uniformly in $n$.


Proof. It is clear that $T$ is linear. For each $x \in E$, the sequence $\{T_n x\}$
converges and hence is bounded, so that we can apply the theorem, and
our corollary follows at once since we see that $T$ is continuous at 0.


The next two theorems provide one type of generalization of the 
inverse mapping theorem, to surjective mapping theorems. 


Theorem 3.4. Let $E$, $F$ be Banach spaces. The subset of $L(E, F)$
consisting of surjective maps is open in $L(E, F)$.


Proof. Let $\lambda\colon E \to F$ be a continuous linear map and assume that $\lambda$ is
surjective. By the open mapping theorem, and homogeneity, there exists
$C > 0$ having the following property. Given $y \in F$ with $|y| \le 1$, there
exists $x \in E$ such that $\lambda x = y$ and $|x| \le C|y|$. Changing the norm on $F$
to an equivalent norm, we may assume without loss of generality that
$C = 1$. Let $0 < r < 1$. Let $\varphi \in L(E, F)$ be such that $|\lambda - \varphi| \le r$. We shall
prove that $\varphi$ is surjective, and it will suffice to prove that $\varphi$ maps the
ball of radius $\frac{1}{1-r}$ in $E$ onto the ball of radius 1 in $F$. Let $y = y_1 \in F$
and $|y_1| \le 1$. By what we have just remarked, there exists $x_1 \in E$ such
that $\lambda x_1 = y_1$ and $|x_1| \le 1$. Let

\[ y_2 = \lambda x_1 - \varphi x_1. \]

Then

\[ |y_2| = |\lambda x_1 - \varphi x_1| \le |\lambda - \varphi|\,|x_1| \le r. \]

There exists $x_2 \in E$ such that $\lambda x_2 = y_2$ and $|x_2| \le r$. Let

\[ y_3 = \lambda x_2 - \varphi x_2. \]

Then

\[ |y_3| = |\lambda x_2 - \varphi x_2| \le r|x_2| \le r^2. \]

There exists $x_3 \in E$ such that $\lambda x_3 = y_3$ and $|x_3| \le r^2$. Continuing inductively,
we find $x_n$ such that $\lambda x_n = y_n$ and

\[ y_{n+1} = \lambda x_n - \varphi x_n, \qquad |y_{n+1}| \le r^n, \qquad |x_{n+1}| \le r^n. \]

Then

\[ y_1 = \varphi x_1 + \cdots + \varphi x_n + y_{n+1}. \]

If we let

\[ x = \sum_{n=1}^{\infty} x_n, \]

then $|x| \le \frac{1}{1-r}$ and $\varphi x = y_1$, thus proving our theorem.

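
The successive corrections in the proof of Theorem 3.4 can be traced on a toy example. The sketch below assumes $E = \mathbf{R}^2$ with the sup norm, $F = \mathbf{R}$, the projection $\lambda(a, b) = a$ (for which one may take $C = 1$), and a perturbation $\varphi(a, b) = 0.8a + 0.1b$ with $|\lambda - \varphi| \le 0.3$. These maps and the function names are illustrative choices of ours.

```python
def lam(x):
    """Surjective map lambda: R^2 -> R, (a, b) -> a."""
    return x[0]

def lam_right_inverse(y):
    """Pick a preimage of y under lambda whose sup norm is |y| (so C = 1)."""
    return (y, 0.0)

def phi(x):
    """A perturbation of lambda with |lambda - phi| <= 0.3 for the sup norm on R^2."""
    return 0.8 * x[0] + 0.1 * x[1]

def solve(y1, steps=60):
    """Sum the corrections x_1 + x_2 + ... so that phi(x) approaches y1."""
    x = (0.0, 0.0)
    y = y1
    for _ in range(steps):
        xn = lam_right_inverse(y)            # lambda(x_n) = y_n
        x = (x[0] + xn[0], x[1] + xn[1])
        y = lam(xn) - phi(xn)                # y_{n+1} = lambda(x_n) - phi(x_n)
    return x

r = 0.3
x = solve(1.0)
print(x, phi(x))                             # phi(x) should be close to y_1 = 1
```

In this example the limit stays inside the ball of radius $1/(1 - r)$, in agreement with the estimate in the proof.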
Theorem 3.5 (Surjective Mapping Theorem (Graves)). Let $U$ be open
in a Banach space $E$. Let $f\colon U \to F$ be a $C^1$ map into a Banach space
$F$. Let $x_0 \in U$. If $f'(x_0)$ is surjective, then $f$ is locally open in a
neighborhood of $x_0$. More precisely, there exists an open neighborhood
$V$ of $x_0$ contained in $U$ having the following property. For each
$x \in V$ and open ball $B_x$ centered at $x$, contained in $V$, the image $f(B_x)$
contains an open neighborhood of $f(x)$.


Proof. After a translation, we may assume that $x_0 = 0$ and $f(x_0) = 0$.
Also by the preceding theorem, it will suffice to prove that if $B$ is an
open ball centered at $0$ in $E$, then $f(B)$ contains an open neighborhood of
$0$ in $F$. Let $\lambda = f'(0)$. By the open mapping theorem and homogeneity,
there exists $C > 0$ having the following property. Given $y \in F$ with $|y| \le 1$,
there exists $x \in E$ such that $\lambda x = y$ and $|x| \le C|y|$. Changing the norm
on $F$ to an equivalent norm, we may assume without loss of generality
that $C = 1$. Let $0 < r < 1$. Taking $B$ with sufficiently small radius, by
the mean value theorem we have for $x, z \in \frac{1}{1-r}B$:

\[ (*) \qquad |f(x) - f(z) - \lambda(x - z)| \le r|x - z|. \]

It will suffice to prove that

\[ f\Bigl(\frac{1}{1-r}B\Bigr) \supset \lambda(B), \]

and we now prove this. Let $x_1 \in B$ and $\lambda x_1 = y_1$. Let $y_2 = \lambda x_1 - f(x_1)$.
By (*) we find

\[ |y_2| = |\lambda x_1 - f(x_1)| \le r|x_1|. \]

There exists $x_2$ with $|x_2| \le r|x_1|$ such that $\lambda x_2 = y_2$. Then

\[ x_1 + x_2 \in (1 + r)B, \]

and by (*) we find

\[ |\lambda x_1 - f(x_1 + x_2)| = |f(x_1) - f(x_1 + x_2) + \lambda x_2| \le r|x_2| \le r^2|x_1|. \]

Let $y_3 = \lambda x_1 - f(x_1 + x_2)$. There exists $x_3$ with $|x_3| \le |y_3| \le r^2|x_1|$, such
that $\lambda x_3 = y_3$. Then

\[ x_1 + x_2 + x_3 \in (1 + r + r^2)B. \]

We have

\[ |y_1 - f(x_1 + x_2 + x_3)| = |\lambda x_1 - f(x_1 + x_2) + f(x_1 + x_2) - f(x_1 + x_2 + x_3)| \]
\[ = |\lambda x_3 + f(x_1 + x_2) - f(x_1 + x_2 + x_3)|. \]

But by (*),

\[ f(x_1 + x_2) - f(x_1 + x_2 + x_3) = -\lambda x_3 + v_3, \quad\text{with } |v_3| \le r|x_3|, \]

so that we get

\[ |y_1 - f(x_1 + x_2 + x_3)| \le r^3|x_1|. \]

Inductively, we find $x_n$ such that $\lambda x_n = y_n$, $|x_n| \le r^{n-1}|x_1|$, and

\[ |y_1 - f(x_1 + \cdots + x_n)| \le r^n|x_1|. \]

We can then find $x_{n+1}$ such that $|x_{n+1}| \le r^n|x_1|$ and

\[ \lambda x_{n+1} = y_1 - f(x_1 + \cdots + x_n). \]

Then

\[ |y_1 - f(x_1 + \cdots + x_{n+1})| = |y_1 - f(x_1 + \cdots + x_n) + f(x_1 + \cdots + x_n) - f(x_1 + \cdots + x_{n+1})| \]
\[ \le r|x_{n+1}| \le r^{n+1}|x_1|. \]

We let

\[ x = \sum_{n=1}^{\infty} x_n \]

and we see that $f(x) = y_1$. Furthermore, $x \in \frac{1}{1-r}B$, thus proving our
theorem.


CHAPTER XVI 


The Spectrum 


In this chapter we give basic facts about the spectrum of an element in a 
Banach algebra. Under certain circumstances, we represent such an alge- 
bra as the algebra of continuous functions on its spectrum, which is 
defined as the space of its maximal ideals (or the space of characters), to 
be given the weak topology. In the next chapters, we shall deal with 
spectral theorems corresponding to more specific examples of Banach 
algebras. The proofs in the later chapters are independent of those in the 
present chapter, except for Theorem 1.2 and its corollaries. Thus the rest 
of this chapter may be bypassed. On the other hand, the general repre- 
sentation of a Banach algebra as an algebra of continuous functions on 
the spectrum gives a nice application of the Stone—Weierstrass theorem, 
and is useful in other contexts besides the spectral theorems which form 
the remainder of the book, so I have included the basic results to pro- 
vide a suitable background for applications not included in this book. 
Let $A$ be a Banach algebra over the complex numbers. We assume that
$A$ has a unit element $e$. Let $v \in A$. The set of complex numbers $z$ such
that $v - ze$ is not invertible is called the spectrum of $v$ and is denoted by
$\sigma(v)$. We shall investigate special cases in Chapters XVII and XVIII.
Here we shall prove that the spectrum is not empty. Before doing that,
we make a simple remark concerning the spectrum.


Theorem 1.1. The spectrum of an element $v \in A$ is a closed and bounded
set in $\mathbf{C}$. In fact, if $z$ is in the spectrum, then $|z| \le |v|$.

Proof. We show that the complement is open. Let $z_0$ be a complex
number such that $v - z_0 e$ is invertible. If $z$ is sufficiently close to $z_0$,
then $v - ze$ is invertible because the set of invertible elements is open, so
the spectrum is closed. Furthermore, if $|z| > |v|$, then $|v/z| < 1$, and hence
$e - v/z$ is invertible by Theorem 2.1 of Chapter IV, §2, so that $v - ze$ is
also invertible, as contended.
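
The invertibility of $e - v/z$ for $|z| > |v|$ can be illustrated with the geometric series $e + u + u^2 + \cdots$ for $u = v/z$. The sketch below assumes that this series is the mechanism behind the cited theorem, and takes for $A$ the algebra of real $2 \times 2$ matrices with the max-row-sum norm. The matrix $v$ and the number $z$ are illustrative choices of ours.

```python
def matmul(P, Q):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(P[i][k] * Q[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def matadd(P, Q):
    """Sum of two 2x2 matrices."""
    return [[P[i][j] + Q[i][j] for j in range(2)] for i in range(2)]

def scale(c, P):
    """Scalar multiple of a 2x2 matrix."""
    return [[c * P[i][j] for j in range(2)] for i in range(2)]

E = [[1.0, 0.0], [0.0, 1.0]]          # unit element e
v = [[0.5, 0.2], [0.1, 0.3]]          # illustrative element with max-row-sum norm 0.7
z = 2.0                               # |z| > |v|

u = scale(1.0 / z, v)                 # u = v/z, so |u| < 1
series, power = E, E
for _ in range(80):                   # partial sum e + u + u^2 + ... of the geometric series
    power = matmul(power, u)
    series = matadd(series, power)

# (e - u) * series should be close to e, so v - z e = -z (e - u) is invertible.
check = matmul(matadd(E, scale(-1.0, u)), series)
inverse_of_v_minus_ze = scale(-1.0 / z, series)
residual = matmul(matadd(v, scale(-z, E)), inverse_of_v_minus_ze)
print(check)
print(residual)
```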


Theorem 1.2. Let $A$ be a commutative normed algebra over the real
numbers, with unit element $e$. Assume that there exists an element $j \in A$
such that $j^2 = -e$. Let $\mathbf{C} = \mathbf{R} + \mathbf{R}j$. Given $v \in A$, $v \ne 0$, there exists an
element $c \in \mathbf{C}$ such that $v - ce$ is not invertible in $A$.


Proof (Tornheim). Assume that $v - ze$ is invertible for all $z \in \mathbf{C}$. Consider
the mapping $f\colon \mathbf{C} \to A$ defined by

\[ f(z) = (v - ze)^{-1}. \]

Then $f$ is continuous, and for $z \ne 0$ we have

\[ f(z) = z^{-1}(z^{-1}v - e)^{-1} = \frac{1}{z}\left(\frac{1}{(v/z) - e}\right). \]

From this we see that $f(z)$ approaches 0 when $z$ goes to infinity in $\mathbf{C}$.
Hence the map $z \mapsto |f(z)|$ is a continuous map of $\mathbf{C}$ into the real numbers
$\ge 0$, is bounded, and is small outside some large circle. Hence it has a
maximum, say $M$. Let $D$ be the set of elements $z \in \mathbf{C}$ such that $|f(z)| =
M$. Then $D$ is not empty, $D$ is bounded, and is closed. We shall prove
that $D$ is open, whence a contradiction.

Let $c_0$ be a point of $D$, which, after a translation, we may assume to
be the origin. We shall see that if $r$ is real $> 0$, then all points on the
circle of radius $r$ lie in $D$. Indeed, consider the sum

\[ S(n) = \frac{1}{n} \sum_{k=1}^{n} \frac{1}{v - \omega^k r}, \]

where $\omega$ is a primitive $n$-th root of unity, say $\omega = e^{2\pi i/n}$. Let
$t$ be a variable. Taking formally the logarithmic derivative of

\[ t^n - r^n = \prod_{k=1}^{n} (t - \omega^k r) \]

shows that

\[ \frac{n t^{n-1}}{t^n - r^n} = \sum_{k=1}^{n} \frac{1}{t - \omega^k r}, \]

and hence, dividing by $n$ and by $t^{n-1}$, and substituting $v$ for $t$, we
obtain

\[ S(n) = \frac{1}{v - r(r/v)^{n-1}}. \]

If $r$ is small (say $|r/v| < 1$), then we see that

\[ \lim_{n \to \infty} |S(n)| = |v^{-1}| = M. \]


Suppose that there exists a complex number $\xi$ of absolute value 1 such
that

\[ \left| \frac{1}{v - \xi r} \right| < M. \]

Then there exists an interval on the unit circle near $\xi$, and there exists
$\varepsilon > 0$ such that for all roots of unity $\zeta$ lying in this interval, we
have

\[ \left| \frac{1}{v - \zeta r} \right| < M - \varepsilon. \]


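(This is true by continuity.) Let us take $n$ very large. Let $b_n$ be the
number of $n$-th roots of unity lying in our interval. Then $b_n/n$ is
approximately equal to the length of the interval (over $2\pi$). We can
express $S(n)$ as a sum

\[ S(n) = \frac{1}{n} \left[ \sum\nolimits_{\mathrm{I}} \frac{1}{v - \omega^k r} + \sum\nolimits_{\mathrm{II}} \frac{1}{v - \omega^k r} \right], \]

the first sum $\sum_{\mathrm{I}}$ being taken over those roots of unity $\omega^k$ lying in
our interval, and the second sum being taken over the others. Each term in
the second sum has norm $\le M$ because $M$ is a maximum. Hence we
obtain the estimate

\[
\begin{aligned}
|S(n)| &\le \frac{1}{n} \Bigl[ \bigl| \sum\nolimits_{\mathrm{I}} \bigr| + \bigl| \sum\nolimits_{\mathrm{II}} \bigr| \Bigr] \\
&\le \frac{1}{n} \bigl( b_n (M - \varepsilon) + (n - b_n) M \bigr) \\
&\le M - \frac{b_n}{n}\,\varepsilon .
\end{aligned}
\]

Since $b_n/n$ stays bounded away from 0 for large $n$, this contradicts the
fact that the limit of $|S(n)|$ is equal to $M$, and proves our theorem.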
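
The averaging identity used in the proof can be checked numerically in the simplest case $A = \mathbf{C}$. The sketch below is only an illustration under that assumption: the function names and the sample values of $v$, $r$ are ours, not part of the text. It compares the average of $1/(v - \omega^k r)$ over the $n$-th roots of unity with the closed form $1/(v - r(r/v)^{n-1})$.

```python
import cmath


def root_average(v: complex, r: float, n: int) -> complex:
    """Average of 1/(v - w^k r) over the n-th roots of unity w^k."""
    w = cmath.exp(2j * cmath.pi / n)  # primitive n-th root of unity
    return sum(1 / (v - w**k * r) for k in range(1, n + 1)) / n


def closed_form(v: complex, r: float, n: int) -> complex:
    """Value 1/(v - r (r/v)^(n-1)) given by the logarithmic derivative."""
    return 1 / (v - r * (r / v) ** (n - 1))


if __name__ == "__main__":
    v, r = 2 + 1j, 0.5  # sample values with |r/v| < 1 (an assumption)
    for n in (1, 5, 50):
        gap = abs(root_average(v, r, n) - closed_form(v, r, n))
        print(n, gap)
    # For large n the average is close to 1/v, as used in the proof.
    print(abs(closed_form(v, r, 50) - 1 / v))
```

For $|r/v| < 1$ the term $r(r/v)^{n-1}$ tends to 0, which is why the averages approach $v^{-1}$.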


Corollary 1.3 (Gelfand-Mazur Theorem). Let $K$ be a normed field over
the reals. Then $K = \mathbf{R}$ or $K = \mathbf{C}$.

Proof. Assume first that $K$ contains $\mathbf{C}$. Then the theorem implies that
$K = \mathbf{C}$. If $K$ does not contain $\mathbf{C}$, in other words does not contain a
square root of $-1$, we let $E = K(j)$ where $j^2 = -1$. (One can give a
formal definition of the field $E$ as one defines the complex numbers from
ordered pairs of real numbers. Thus we let $E$ consist of pairs $(x, y)$ with
$x, y \in K$, and define multiplication in $E$ as if $(x, y) = x + yj$. This
makes $E$ into a field.) We can define a norm on $E$ by putting

\[ |x + yj| = |x| + |y| \]

for $x, y \in K$. Then $E$ is a normed $\mathbf{R}$-space. Furthermore, if
$z = x + yj$ and $z' = x' + y'j$ are in $E$, then

\[
\begin{aligned}
|zz'| &= |xx' - yy'| + |xy' + x'y| \\
&\le |xx'| + |yy'| + |xy'| + |x'y| \\
&\le |x||x'| + |y||y'| + |x||y'| + |x'||y| \\
&\le (|x| + |y|)(|x'| + |y'|) \\
&\le |z||z'|.
\end{aligned}
\]

Hence we have defined a norm on $E$. We can apply Theorem 1.2 to
conclude the proof.


Corollary 1.4. The spectrum of an element in any complex Banach 
algebra (commutative or not) with unit element is not empty. 


Proof. If $A$ is a Banach algebra with unit, and if $v \in A$, then the
closure of the algebra generated by $e$ and $v$ is a commutative Banach
algebra. Hence we can apply the theorem to it.


We shall see later in this chapter that under fairly general conditions, 
a Banach algebra is isomorphic to the algebra of continuous functions on 
a compact set. This set is obtained in a natural way, namely it is the 
maximal ideal space of A. 

Finally, we remark that the fact that the spectrum is not empty can
also be proved by quoting an elementary theorem about analytic functions
of a complex variable, namely that a bounded analytic function is
constant. The proof runs as follows. Suppose that we have an element $v$
in our algebra such that $(v - ze)^{-1}$ exists for all complex $z$. Then
certainly the map

\[ z \mapsto (v - ze)^{-1} \]

is not constant, and hence there exists a functional $\lambda$ on the algebra
such that the map

\[ z \mapsto \lambda[(v - ze)^{-1}] = f(z) \]

is not constant. However, it is immediately verified that this map is
differentiable (i.e. complex differentiable), and since $(v - ze)^{-1} \to 0$ as
$z \to \infty$, it follows that $f$ is bounded, a contradiction which proves what
we wanted.

We will need to consider Banach-valued analytic functions somewhat
further in the context of the remark above. Let $E$ be a Banach space,
and let $U$ be open in $\mathbf{C}$. Let $f: U \to E$ be a mapping. We define $f$ to
be analytic if given $z_0 \in U$ there exist elements $a_n \in E$ such that $f(z)$
is equal to a convergent power series

\[ f(z) = \sum a_n (z - z_0)^n \]

for $z$ in some disc of positive radius centered at $z_0$. As usual in complex
analysis, convergent here means absolutely convergent. Observe that for
every functional $\lambda$ on $E$, we have

\[ \lambda \circ f(z) = \sum \lambda(a_n)(z - z_0)^n. \]

Thus $\lambda \circ f$ is an analytic function in the usual sense of complex analysis.
By the Hahn-Banach theorem, and the usual uniqueness of a power
series expansion for complex valued analytic functions, we also get the
uniqueness for Banach-valued analytic functions, and we get the
uniqueness of analytic continuation. Furthermore, a number of theorems
from complex analysis are valid in the Banach-valued case, and their
proofs can be carried out by using the Hahn-Banach theorem. For
instance for an analytic function on a closed disc of radius $R$, we have
Cauchy's integral formula

\[ f(z) = \frac{1}{2\pi i} \int_C \frac{f(\zeta)}{\zeta - z}\, d\zeta, \]

where $C$ is the circle of radius $R$ centered at the origin.

Indeed, the integral on the right is the simple-minded integral as 
defined in calculus, Chapter XIII, §1. To prove that both sides of the 
formula are equal, by the Hahn—Banach theorem it suffices to prove that 
applying every functional to one side is equal to the functional applied to 
the other side. We can use Proposition 1.1 of Chapter XIII to see this. 

The Cauchy integral representation allows us to prove in the
Banach-valued case that if $f$ is analytic on a closed disc of radius $R$,
then the radius of convergence of its power series centered at 0 is at least
equal to $R$. Indeed, the usual proof works by using the geometric series,
writing

\[ \frac{1}{\zeta - z} = \frac{1}{\zeta}\,\frac{1}{1 - z/\zeta} = \frac{1}{\zeta} \sum_{n=0}^{\infty} \left( \frac{z}{\zeta} \right)^n. \]

Thus we get the same integral formula for the coefficients of the power
series for $f$ at 0 as in the complex valued case.

We leave it to the reader to verify that if $A$ is a Banach algebra with
unit $e$, and $x \in A$, then the function

\[ f(z) = (x - ze)^{-1} \]

is analytic on the open set of complex numbers $z$ such that $x - ze$ is
invertible.

The same arguments as in ordinary complex analysis show that the
radius of convergence of a power series $\sum a_n z^n$ is
$1/\limsup |a_n|^{1/n}$. We shall be concerned with a special power series as
follows.

Let $A$ be a Banach algebra with unit element $e$. We denote by $\sigma(x)$
the spectrum of an element $x \in A$. We are interested in bounds for the
spectrum. The essential structure will be derived from the power series

\[ \sum x^n z^n \]

in the variable $z$, with coefficients in $A$. The next lemmas describe its
radius of convergence more precisely.


Lemma 1.5. Let $A$ be a normed algebra. Let $x \in A$. Then

\[ \lim_{n \to \infty} |x^n|^{1/n} \]

exists, and is equal to $\inf_n |x^n|^{1/n}$.

Proof. Without loss of generality, we may assume that $x^n \ne 0$ for all
positive integers $n$. Let $c_n = |x^n| > 0$. We obviously have

\[ c_{m+n} \le c_m c_n \]

for all positive integers $m$, $n$. We put $c_0 = 1$. Fix $m$. Then any integer
$n$ can be written in the form $n = q(n)m + r(n)$ with $0 \le r(n) < m$. Then

\[ c_n^{1/n} \le c_m^{q(n)/n}\, c_{r(n)}^{1/n}. \]

But $r(n)$ is bounded by $m$, and $\lim q(n)/n = 1/m$. Hence

\[ \limsup_{n \to \infty} c_n^{1/n} \le c_m^{1/m}. \]

This inequality holds for all positive integers $m$, and hence

\[ \limsup_{n \to \infty} c_n^{1/n} \le \inf_m c_m^{1/m} \le \liminf_{n \to \infty} c_n^{1/n}. \]

This proves the lemma.

In light of the lemma, we define the spectral radius

\[ \rho(x) = \lim |x^n|^{1/n}. \]

The reason for the name will become clear from Theorem 1.7 below.
We are now ready to consider the power series

\[ h_x(z) = \sum x^n z^n \]

with $x^n z^n \in A$ for all complex numbers $z$. We are interested in the
domain of convergence of the series.


Lemma 1.6. Suppose $A$ is a Banach algebra with unit. Then the radius
of convergence of the series $h_x$ is $1/\rho(x)$.

Proof. This is immediate from the usual arguments in the theory of a
complex variable, which show in the present context that the radius of
convergence is $1/\limsup |x^n|^{1/n}$. We can then apply Lemma 1.5.
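
As a rough numerical illustration of Lemmas 1.5 and 1.6, one may take for $A$ the algebra of $2 \times 2$ real matrices with the maximum row sum norm (a submultiplicative norm). The matrix below and the helper names are assumptions chosen for the example only. Its norm is 10.5, but the roots $|x^n|^{1/n}$ decrease towards 0.5, so the series $\sum x^n z^n$ has radius of convergence about 2 rather than $1/|x|$.

```python
def mat_mul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def row_sum_norm(a):
    """Maximum absolute row sum, a submultiplicative matrix norm."""
    return max(sum(abs(entry) for entry in row) for row in a)


def root_norms(x, n_max):
    """Return the list of |x^n|^(1/n) for n = 1, ..., n_max."""
    values, power = [], x
    for n in range(1, n_max + 1):
        values.append(row_sum_norm(power) ** (1.0 / n))
        power = mat_mul(power, x)
    return values


if __name__ == "__main__":
    x = [[0.5, 10.0], [0.0, 0.5]]  # sample element; its only eigenvalue is 0.5
    vals = root_norms(x, 200)
    print("norm of x:", vals[0])
    print("n = 200 estimate of the spectral radius:", vals[-1])
    print("corresponding radius of convergence:", 1 / vals[-1])
```

The convergence is slow here because $|x^n| = 0.5^n(1 + 20n)$, so the estimate at $n = 200$ is still slightly above the limit.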


In Theorem 1.1 we saw that the spectrum $\sigma(x)$ is compact and
contained in the disc $|z| \le |x|$. We now prove a more precise inequality.


Theorem 1.7. Let $A$ be a Banach algebra with unit $e \ne 0$, and let
$x \in A$. Then

\[ \rho(x) = \sup_{z \in \sigma(x)} |z|. \]

Proof. Suppose $|z| > \rho(x)$. Let

\[ \rho(x) < r < |z|. \]

Then $|(z^{-1}x)^n| \le (r/|z|)^n$ for large $n$, so the series $\sum (z^{-1}x)^n$
converges and shows that $z \notin \sigma(x)$. Conversely, let $s = \sup |z|$ for
$z \in \sigma(x)$, so $s$ is the smallest radius for a closed disc centered at the
origin and containing the spectrum $\sigma(x)$. Suppose $w \in \mathbf{C}$ and
$|w| > s$. Then $w \notin \sigma(x)$, so $x - we$ is invertible. Thus the function

\[ h(z) = (e - zx)^{-1} \]

is analytic on the disc $|z| < s^{-1}$. As we remarked earlier in this section,
this implies that the radius of convergence of the power series

\[ \sum x^n z^n \]

is $\ge s^{-1}$, and therefore

\[ s^{-1} \le \rho(x)^{-1}, \quad\text{whence}\quad \rho(x) \le s. \]

This concludes the proof of Theorem 1.7.


XVI, §2. THE GELFAND TRANSFORM 


Throughout, we let $A$ be a commutative Banach algebra with unit
element $e \ne 0$, over the complex numbers.


We wish to represent elements of A as continuous functions on a certain 
space, whose points are going to be the maximal ideals of A. We do this 
in a sequence of lemmas. 


Lemma 2.1. Let M be a maximal ideal of A. Then M is closed. 


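Proof. By definition, $M \ne A$ and so $e \notin M$. By Theorem 2.1 of
Chapter IV, there is an open neighborhood $U$ of $e$ consisting of invertible
elements. Since $M$ contains no invertible element, $U$ is contained in the
complement of $M$, so $U$ is contained in the complement of the closure
$\overline{M}$. Thus $\overline{M}$ is an ideal containing $M$ and $\ne A$, whence
$\overline{M} = M$ by maximality, thus proving the lemma.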


Lemma 2.2. An element $u \in A$ is invertible if and only if $u$ is not
contained in any maximal ideal.


Proof. It is clear that an invertible element is not contained in a
maximal ideal (which, by definition, is $\ne A$). Conversely, suppose $u$ is
not contained in any maximal ideal. Let $J = Au$ be the ideal generated
by $u$. If $J = A$, then $u$ is invertible. If $J \ne A$, then by Zorn's lemma,
$J$ is contained in a maximal ideal $M$. Indeed, let $S$ be the set of ideals
$I$ containing $J$ and $\ne A$. Then $S$ is ordered by inclusion, and $S$ is
inductively ordered, for let $\{I_k\}$ be a totally ordered subset. Let
$I = \bigcup I_k$. Then it is immediately verified that $I$ is an ideal, and
$I \ne A$, for otherwise $I$ contains the unit element, and this unit element
must therefore lie in some ideal $I_k$, contrary to the assumption that
$I_k \ne A$ for all $k$. This shows that if $J \ne A$ then $u$ is contained in
some maximal ideal, a contradiction which proves the lemma.


Let $\lambda: A \to \mathbf{C}$ be a functional, i.e. a continuous linear map into
the scalars $\mathbf{C}$. If in addition $\lambda$ satisfies the multiplicative conditions

\[ \lambda(xy) = \lambda(x)\lambda(y) \quad\text{and}\quad \lambda(e) = 1, \]

then we call $\lambda$ a character of $A$. Thus a character is a Banach-algebra
homomorphism (rather than a Banach space homomorphism) into the
scalars $\mathbf{C}$. The kernel of such a character is a maximal ideal, because
its image is the field $\mathbf{C}$. Conversely, by the theorem of Gelfand-Mazur
(Corollary 1.3) if $M$ is a maximal ideal, then $A/M$ is naturally isomorphic
to $\mathbf{C}$, and therefore we obtain a character

\[ \lambda_M: A \to \mathbf{C} \]

which to each element $x \in A$ associates its residue class $x + M$ in
$A/M = \mathbf{C}$. We then obtain:


Proposition 2.3. The association $M \mapsto \lambda_M$ is a bijection between the
set of maximal ideals and the set of characters of $A$. For each character
$\lambda$, we have $|\lambda| \le 1$.


Proof. The first statement has been proved. As to the second, let us
first prove that if $|x| \le 1$ then $|\lambda(x)| \le 1$. If $|\lambda(x)| > 1$, then
$|\lambda(x^n)| \to \infty$ as $n \to \infty$, but $|x^n| \le 1$, which contradicts the
continuity of $\lambda$. Thus $|\lambda(x)| \le 1$. Then for any $x \ne 0$, let
$y = x/|x|$ so $|y| = 1$. Then $|\lambda(y)| \le 1$, so $|\lambda| \le 1$, thus proving
the proposition.


In light of the fact that $\lambda = \lambda_M$ for some $M$, we see that we may
rephrase Proposition 2.3 as follows. If $x \in A$ and $c \in \mathbf{C}$ is such that
$x \equiv c \bmod M$, and if $|x| \le 1$, then $|c| \le 1$.


The Banach algebra $A$ being a Banach space has its dual $A'$. But it
also has the set of characters, which we denote by $\hat{A}$, and
$\hat{A} \subset A'_1$, where $A'_1$ is the unit ball in $A'$. In Chapter IV, §1 we
defined the weak topology on $A'_1$ by embedding $A'_1$ in a product space

\[ f: A'_1 \to \prod_{|x| \le 1} K_x \subset \prod_{x \in A} \mathbf{C}. \]

In particular, we obtain the weak topology on $\hat{A}$ because as we have just
seen, we have an inclusion of $\hat{A}$ in the unit ball of $A'$:

\[ \hat{A} \subset A'_1. \]

Let $\mathcal{M}$ be the set of maximal ideals of $A$. In light of the bijection of
$\mathcal{M}$ with $\hat{A}$, we have a natural embedding (the restriction of $f$)

\[ g: \mathcal{M} \leftrightarrow \hat{A} \to \prod_{|x| \le 1} K_x \quad\text{by}\quad g(\lambda) = (g_x(\lambda))_{|x| \le 1}, \]

where for each $x \in A$ and $\lambda \in \hat{A}$ we have $g_x(\lambda) = \lambda(x)$.


Theorem 2.4. The image $g(\hat{A})$ is closed and therefore $\hat{A}$ is compact in
the weak topology.


Proof. The argument is basically the same as the argument used to
prove Theorem 1.4 of Chapter IV. If $\gamma$ is in the closure of $g(\hat{A})$, then
one proves as before by a continuity argument that $\gamma$ satisfies the
condition

\[ \gamma(xy) = \gamma(x)\gamma(y), \]

just as we proved $\gamma(x + y) = \gamma(x) + \gamma(y)$. We leave the routine to the
reader.


Since each function $g_x$ ($x \in A$) is continuous on $\mathcal{M} = \hat{A}$ by
definition of the weak topology, the map

\[ x \mapsto g_x \]

is a mapping from $A$ to $C(\mathcal{M}, \mathbf{C}) = C(\hat{A}, \mathbf{C})$. It is immediately
verified that this map is a ring homomorphism, i.e.

\[ g_{x+y} = g_x + g_y \quad\text{and}\quad g_{xy} = g_x g_y. \]

Also $g_{cx} = c\,g_x$ for $c \in \mathbf{C}$, so we say that $x \mapsto g_x$ is an algebra
homomorphism. Thus we have represented the elements of $A$ as continuous
functions on the maximal ideal space, or on the character space depending
on which notion one wishes to emphasize. This representation is called
the Gelfand transform. The kernel of the homomorphism

\[ A \to C(\mathcal{M}, \mathbf{C}) = C(\hat{A}, \mathbf{C}) \]

is obviously the intersection of all maximal ideals, which is called the
radical of $A$. For general Banach algebras, one cannot say anything
further, but in the next section, we shall give a criterion when this kernel
is 0. In such a case, $A$ "is" the algebra of continuous functions on a
compact space, obtained from $A$ in a natural way.
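
To see that the radical can be non-zero, one may look at a small example which is not taken from the text: the two-dimensional algebra of elements $ae + bn$ with $n^2 = 0$, normed by $|a| + |b|$. The sketch below, with hypothetical helper names, models an element as a pair $(a, b)$. The only character is $(a, b) \mapsto a$, so the Gelfand transform forgets $b$ and its kernel is the set of all $(0, b)$.

```python
def mul(x, y):
    """Product in the algebra of elements a*e + b*n with n*n = 0."""
    return (x[0] * y[0], x[0] * y[1] + x[1] * y[0])


def character(x):
    """The unique character of this algebra: (a, b) -> a."""
    return x[0]


def norm(x):
    """The norm |a| + |b|, which is submultiplicative for this product."""
    return abs(x[0]) + abs(x[1])


if __name__ == "__main__":
    x, y = (2 + 1j, 3), (1 - 1j, 4j)
    # The character is multiplicative.
    print(character(mul(x, y)) == character(x) * character(y))
    # The norm inequality |xy| <= |x||y| holds for the sample pair.
    print(norm(mul(x, y)) <= norm(x) * norm(y))
    # n = (0, 1) is non-zero, yet it squares to 0 and the character kills it.
    n = (0, 1)
    print(mul(n, n), character(n))
```

In this example the Gelfand transform is not injective, which is the situation that the criterion of the next section rules out.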


XVI, §3. C*-ALGEBRAS 


Let $A$ be a complex Banach algebra. By an involution on $A$ we mean a
map $x \mapsto x^*$ of $A$ into itself satisfying:

\[ (x + y)^* = x^* + y^*, \qquad (\alpha x)^* = \bar{\alpha} x^*, \]

\[ (xy)^* = y^* x^*, \qquad x^{**} = x. \]

If $A$ has a unit element $e$, then the reader will verify at once that
$e^* = e$. If further $x$ is invertible, then $(x^{-1})^* = (x^*)^{-1}$.

By a C*-algebra, we mean a Banach algebra with an involution as
above, satisfying the additional condition

\[ |x^* x| = |x|^2 \quad\text{for all } x \in A. \]

We suppose $A$ is a C*-algebra for the rest of this section. Observe that
from the defining condition, we get

\[ |x|^2 \le |x^*||x| \quad\text{whence}\quad |x| \le |x^*|. \]

Since $x = x^{**}$, it follows that

\[ |x| = |x^*|. \]


Examples. The standard example of a commutative C*-algebra is that
of the algebra of continuous functions on a compact Hausdorff space,
with the sup norm. The involution is given by the complex conjugate,
$f^* = \bar{f}$. Theorem 3.3 below shows that under certain conditions, there is
no other.

Let H be a Hilbert space, and let A = End(H) be the algebra of 
bounded linear maps of H into itself (operators). Then A is a C*-algebra, 
the star operation being the adjoint. In Chapter XVIII, we shall consider 
the commutative subalgebra generated by one hermitian operator and 
reprove the basic theorem independently of the result in this section. 

For another example of an involution (which does not satisfy the 
condition of a C*-algebra), see Exercise 9. 
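
The difference between the two kinds of involution can be seen in a small computation. The sketch below is illustrative only; its models are our assumptions: functions on a finite set stand in for continuous functions with the sup norm, and finitely supported sequences on $\mathbf{Z}$ with the $\ell^1$ norm, convolution product and $c^*_n = \overline{c_{-n}}$ stand in for an involutive algebra that fails the C*-condition.

```python
def sup_norm(f):
    """Sup norm of a function on a finite set, given as a list of values."""
    return max(abs(z) for z in f)


def star_pointwise(f):
    """Involution f* = complex conjugate of f."""
    return [complex(z).conjugate() for z in f]


def mul_pointwise(f, g):
    """Pointwise product of two functions on the same finite set."""
    return [a * b for a, b in zip(f, g)]


def l1_norm(c):
    """l^1 norm of a finitely supported sequence {index: coefficient}."""
    return sum(abs(z) for z in c.values())


def star_l1(c):
    """Involution c*_n = conjugate of c_{-n}."""
    return {-n: complex(z).conjugate() for n, z in c.items()}


def convolve(c, d):
    """Convolution product of two finitely supported sequences."""
    out = {}
    for m, a in c.items():
        for n, b in d.items():
            out[m + n] = out.get(m + n, 0) + a * b
    return out


if __name__ == "__main__":
    f = [1 + 2j, -3 + 0j, 0.5j]
    # Sup norm: |f* f| equals |f|^2.
    print(sup_norm(mul_pointwise(star_pointwise(f), f)), sup_norm(f) ** 2)
    c = {0: 1, 1: 1, 2: -1}
    # l^1 norm: |c* c| is strictly smaller than |c|^2 for this c.
    print(l1_norm(convolve(star_l1(c), c)), l1_norm(c) ** 2)
```

In the second case cancellation inside the convolution makes $|c^* c|$ smaller than $|c|^2$, so the defining condition of a C*-algebra fails for that norm.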


We have assumed the existence of a unit element in the algebra A for 
simplicity. When there is no unit element a priori, one may embed A in 
another algebra with unit, i.e. adjoin a unit element, and thereby reduce 
the study of an involution to the case when a unit element is present. 
We leave this construction as Exercises 5 through 8. 


Proposition 3.1. Let $A$ be a C*-algebra with unit. Let $x \in A$. If
$x = x^*$ then the spectrum $\sigma(x)$ is real and $\rho(x) = |x|$.


Proof. Let $z \in \sigma(x)$. For all real $t$ it follows that
$z + it \in \sigma(x + ite)$. By Theorem 1.7 we obtain:

\[
\begin{aligned}
|z + it|^2 \le |x + ite|^2 &\le |(x + ite)(x - ite)| \\
&\le |x^2 + t^2 e| \\
&\le |x|^2 + t^2
\end{aligned}
\]

using the condition of a C*-algebra, and the convention that $|e| = 1$.
Write $z = a + ib$ with $a$, $b$ real. Then for all real $t$ we find

\[ a^2 + b^2 + 2bt \le |x|^2, \]

which is possible only if $b = 0$, so $z$ is real, and the spectrum is real.

As to the statement about the spectral radius, from the condition
defining a C*-algebra, we have

\[ |x|^{2^n} = |x^{2^n}|, \]

and we simply use the definition of the spectral radius to conclude the
proof.
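
Proposition 3.1 can be tested on a concrete hermitian element of the algebra of $2 \times 2$ complex matrices, viewed as operators on $\mathbf{C}^2$ with the operator norm. The following sketch rests on assumptions of ours: the sample matrix, the starting vector and the use of power iteration to estimate the norm are not taken from the text.

```python
import cmath
import math


def eigenvalues_2x2(m):
    """Roots of the characteristic polynomial of a 2 x 2 matrix."""
    (a, b), (c, d) = m
    trace, det = a + d, a * d - b * c
    root = cmath.sqrt(trace * trace - 4 * det)
    return (trace + root) / 2, (trace - root) / 2


def apply(m, v):
    """Matrix times column vector."""
    return [m[0][0] * v[0] + m[0][1] * v[1], m[1][0] * v[0] + m[1][1] * v[1]]


def vec_norm(v):
    """Euclidean norm on C^2."""
    return math.sqrt(sum(abs(z) ** 2 for z in v))


def operator_norm(m, steps=200):
    """Estimate sup |m v| / |v| by power iteration on m* m."""
    star = [[m[0][0].conjugate(), m[1][0].conjugate()],
            [m[0][1].conjugate(), m[1][1].conjugate()]]
    v = [1.0 + 0j, 0.3 + 0.2j]  # arbitrary starting vector (assumption)
    for _ in range(steps):
        v = apply(star, apply(m, v))
        length = vec_norm(v)
        v = [z / length for z in v]
    return vec_norm(apply(m, v))


if __name__ == "__main__":
    x = [[1 + 0j, 2j], [-2j, 1 + 0j]]  # sample matrix with x = x*
    l1, l2 = eigenvalues_2x2(x)
    print("spectrum:", l1, l2)
    print("spectral radius:", max(abs(l1), abs(l2)))
    print("operator norm:", operator_norm(x))
```

For this matrix both eigenvalues come out real, and the largest absolute value agrees with the estimated norm.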


Proposition 3.2. Let $A$ be a commutative C*-algebra with unit. Let
$\lambda$ be a character. Then $\lambda(x^*) = \overline{\lambda(x)}$ for all $x \in A$.


Proof. If $x = x^*$ then the assertion follows from Proposition 3.1
because $\lambda(x)$ is in the spectrum of $x$. For $x$ arbitrary, we decompose
$x$ into a sum

\[ x = u + iv \quad\text{with}\quad u = \frac{1}{2}(x + x^*) \quad\text{and}\quad v = \frac{1}{2i}(x - x^*). \]

Then $u$, $v$ satisfy $u = u^*$ and $v = v^*$. Furthermore

\[ \lambda(x^*) = \lambda(u - iv) = \lambda(u) - i\lambda(v) = \overline{\lambda(u) + i\lambda(v)} = \overline{\lambda(x)}, \]

which proves the proposition.

We now come to the main structure theorem.


Theorem 3.3 (Gelfand-Naimark). Let $A$ be a commutative C*-algebra
with unit. Then the intersection of all maximal ideals of $A$ is 0, and the
map $x \mapsto g_x$ gives a norm-preserving isomorphism of $A$ on
$C(\mathcal{M}, \mathbf{C})$.


Proof. It follows as an immediate consequence of the Stone-Weierstrass
theorem that the image $g(A)$ in $C(\mathcal{M})$ is dense. Since $A$ is assumed
complete, it will suffice to prove that the map $g$ is norm-preserving. Let
$x \in A$ and write $g(x)$ instead of $g_x$. Then by Proposition 3.2,

\[ g(x^* x) = g(x^*) g(x) = \overline{g(x)}\, g(x) = |g(x)|^2. \]

But the element $y = x^* x$ satisfies $y = y^*$. By Theorem 1.7 and
Proposition 3.1, we find

\[ |g(y)| = \rho(y) = |y|, \]

so $|g(x)| = |x|$ by the definition of a C*-algebra, thus concluding the
proof.


XVI, §4. EXERCISES 


1. Let $A$ be a complex Banach algebra with unit element, and let $u \in A$.
Let $\sigma(u)$ be the spectrum of $u$. Let $p$ be a polynomial with complex
coefficients. Show that the spectrum of $p(u)$ is equal to $p(\sigma(u))$, i.e.
to the set of all numbers $p(\alpha)$, where $\alpha$ lies in the spectrum of $u$.
[Hint: For one inclusion write

\[ p(t) - p(\alpha) = (t - \alpha) q(t) \]

for some polynomial $q$, and for the other, write

\[ p(t) - \alpha = (t - a_1) \cdots (t - a_n) \]

if $\alpha$ is in the spectrum of $p(u)$.] Of course, the result applies
especially if $u$ is an operator on a Banach space $E$.


2. Let $E$ be a Banach space and let $F$, $G$ be two closed subspaces such
that $E = F \oplus G$ is their direct sum. Let $A$ be an operator on $E$ and
assume that $F$, $G$ are $A$-invariant. Let $A_F$ and $A_G$ denote the
restrictions of $A$ to $F$ and $G$ respectively. Let $\sigma(A)$ denote the
spectrum of $A$.

(a) Let $\alpha$ be a complex number. Show that $A - \alpha I$ is invertible if and
only if $A_F - \alpha I_F$ and $A_G - \alpha I_G$ are invertible.
(b) Show that

\[ \sigma(A) = \sigma(A_F) \cup \sigma(A_G). \]


3. Let $A$ be a complex Banach algebra with unit element $e$ and let
$u \in A$. Show that the map

\[ z \mapsto (u - ze)^{-1} \]

is (complex) differentiable and analytic on the complement of the spectrum
of $u$. One calls $R(u, z) = (u - ze)^{-1}$ the resolvent of $u$.
Run through systematically some of the basic theorems of complex
analysis and prove them in the Banach-valued case if they are true.
(Liouville's theorem, Cauchy's theorem, etc.)


4. Let $w$ be the function defined for real $t$ by

\[ w(t) = e^{it}, \]

so that $w$ can be viewed as a function on the unit circle. Let $A$ be the
set of all functions which can be written as infinite sums

\[ f = \sum c_n w^n \]

with $c_n$ complex, satisfying

\[ \sum_{n=-\infty}^{\infty} |c_n| < \infty. \]

For the norm defined by

\[ \|f\| = \sum |c_n|, \]

show that $A$ is a Banach algebra under ordinary addition and
multiplication of functions. Prove that if $f(t) \ne 0$ for all $t$, then $f$ is
invertible in $A$. [Hint: It suffices to prove that for any character $\lambda$ of
$A$ we have $\lambda(f) \ne 0$. Show that $|\lambda(w)| = 1$ so that
$\lambda(w) = w(t_0)$ for some $t_0$.]


5. Adjunction of a unit element. Let $A$ be a normed algebra. We embed
$A$ in a normed algebra with unit as follows. Let
$\tilde{A} = \mathbf{C} \times A$, with addition defined componentwise and
multiplication defined by $(z, x)(w, y) = (zw, zy + wx + xy)$. Define

\[ |(z, x)| = |z| + |x| \quad\text{for } z \in \mathbf{C} \text{ and } x \in A. \]

Prove that $\tilde{A}$ is a normed algebra, with unit $(1, 0)$, and containing an
isomorphic image of $A$ as the subset of elements $(0, x)$ with $x \in A$.
Show that $A$ is an ideal in $\tilde{A}$. Warning: If $A$ happened to have a unit
element, then this unit is not the same one as the unit in $\tilde{A}$.
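
The construction can be tried out on the smallest possible case. The sketch below is an illustration only: it assumes $A = \mathbf{C}$ with its usual product, and it assumes the standard multiplication rule $(z, x)(w, y) = (zw, zy + wx + xy)$ on pairs. It checks that $(1, 0)$ acts as the unit, that the old unit of $A$, embedded as $(0, 1)$, does not, and that the elements $(0, x)$ absorb products.

```python
def mul(p, q):
    """Product (z, x)(w, y) = (zw, zy + wx + xy) on pairs in C x A."""
    (z, x), (w, y) = p, q
    return (z * w, z * y + w * x + x * y)


def norm(p):
    """Norm |(z, x)| = |z| + |x|."""
    return abs(p[0]) + abs(p[1])


UNIT = (1, 0)


if __name__ == "__main__":
    p, q = (2, 3), (1 - 1j, 4j)
    print(mul(UNIT, p) == p and mul(p, UNIT) == p)  # (1, 0) is the unit
    print(mul((0, 1), p))  # the old unit of A does not fix p
    print(mul((0, 5), q)[0] == 0)  # products with (0, x) stay in the ideal
    print(norm(mul(p, q)) <= norm(p) * norm(q))  # norm inequality on a sample
```

The second line of output is the point of the warning: the unit of $A$ survives only as an idempotent of the larger algebra.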


6. Suppose $A$ is a C*-algebra but without our assuming that $A$ has a unit
element. Prove that there exists on $\tilde{A}$ a norm extending the norm on
$A$ which makes $\tilde{A}$ into a C*-algebra with unit. (Warning: It is not the
norm of Exercise 5.) [Hint: Observe that for $x \in A$, we have

\[ |x| = \sup_{y \in A,\ y \ne 0} \frac{|xy|}{|y|} \]

and so for $x \in \tilde{A}$ define $|x|$ by the same formula. Since $A$ is an
ideal in $\tilde{A}$, it follows that $xy \in A$ for $x \in \tilde{A}$ and $y \in A$.
Prove that this definition gives a norm on $\tilde{A}$, and that this norm
satisfies the condition for a C*-algebra.]


7. Let $A$ be a Banach algebra with an involution, and let $B$ be a
C*-algebra. Let $h: A \to B$ be an algebraic homomorphism (no condition
is put with respect to the norm) such that $h(x^*) = h(x)^*$. Prove that

\[ |h(x)| \le |x| \quad\text{for all } x \in A. \]


8. Using Exercise 7, prove that the norm on $\tilde{A}$ defined in Exercise 6 is
the unique norm extending the norm on $A$, and making $\tilde{A}$ into a
C*-algebra.


9. With the same notation as Exercise 10 of Chapter XII, if
$m \in M^1(G)$, define


Show that this star operation is an involution, and that $\|m\| = \|m^*\|$.


10. Let $A$ be a C*-algebra with unit, and let $x \in A$ be such that
$x^* = x^{-1}$. Show that the spectrum of $x$ lies on the unit circle.


11. Let $A$ be a Banach algebra with unit $e$. Let $v \in A$. Show that there
exists a unique $C^\infty$ mapping $K: \mathbf{R} \to A$ such that:

H 1. We have $\dfrac{d}{dt} K(t) = K(t)v$.
H 2. $K(0) = e$.

Show that the image of $K$ lies in the multiplicative group of invertible
elements of $A$.


CHAPTER XVII 


Compact and Fredholm 
Operators 


The operators in infinite dimensional spaces closest to operators in finite 
dimensional spaces are the compact operators, which will now be studied 
systematically. A large number of examples of compact operators are 
given in the exercises. 


XVII, §1. COMPACT OPERATORS 


We recall that a subset of a topological space is said to be relatively
compact if its closure is compact. We had proved a convenient criterion
for this (Corollary 3.9 of Chapter II), namely:


Let $X$ be a subset of a complete normed vector space. Assume that
given $r > 0$ there exists a finite covering of $X$ by balls of radius $r$.
Then $X$ is relatively compact.


This criterion will be used frequently in this chapter.

Let $E$, $F$ be normed vector spaces (not necessarily complete) and let

\[ u: E \to F \]

be a linear map. We say that $u$ is compact if $u$ maps bounded sets in $E$
into relatively compact sets in $F$. Equivalently, we can say that $u$ maps
the unit ball in $E$ into a relatively compact set in $F$. It is then clear that
$u$ must be continuous, because if $B$ is the unit ball in $E$, then $u(B)$ has
compact closure, whence is bounded. It is also clear that our definition
is equivalent to saying that if $\{x_n\}$ is a bounded sequence in $E$, then
$\{ux_n\}$ has a convergent subsequence.


Examples. If E or F is finite dimensional, then u is compact. Conse- 
quently, if just the image of u is finite dimensional, then u is compact. 

Since a locally compact Banach space is finite dimensional, by Corol- 
lary 3.15 of Chapter II, it follows that the identity map of an infinite 
dimensional Banach space is not compact. 


In a later section, we shall prove that the following type of operator is
compact. Let $K(x, y)$ be a continuous function on the rectangle
$a \le x \le b$ and $c \le y \le d$. If $f$ is continuous on $[a, b]$, we define

\[ Sf(y) = \int_a^b K(x, y) f(x)\, dx. \]

It will be shown later that $S$ is compact. Thus our theory applies to the
study of this type of integral equation.


We denote by $K(E, F)$ the set of compact linear maps of $E$ into $F$.


Theorem 1.1. The compact linear mappings from $E$ to $F$ form a vector
space. If $F$ is complete, then $K(E, F)$ is a closed subspace of $L(E, F)$.


Proof. If $X$, $Y$ are compact in $F$ then $X + Y$ is compact, being the
continuous image of the compact set $X \times Y$ under the map
$(x, y) \mapsto x + y$. If $B$ is the unit ball in $E$, then it follows that for
$u, v \in K(E, F)$ the set $\overline{u(B)} + \overline{v(B)}$ is compact. But then

\[ (u + v)(B) \subset u(B) + v(B) \subset \overline{u(B)} + \overline{v(B)}. \]

Since $u(cB) = cu(B)$ for any scalar $c$, it follows that $K(E, F)$ is a vector
space. To show it is closed in $L(E, F)$ when $F$ is complete, let $u$ be in its
closure. It will suffice to prove that $u(B)$ is covered by a finite number of
open balls of given radius $r$. Let $v \in K(E, F)$ be such that
$|u - v| < r/2$. Since $v$ is compact, we know that $v(B)$ is covered by a
finite number of open balls of radius $r/2$, centered say at points
$y_1, \ldots, y_n$. For each $x \in B$ we then have

\[ |u(x) - v(x)| < r/2 \quad\text{and}\quad |v(x) - y_i| < r/2 \]

for some $i$. This implies that $|u(x) - y_i| < r$, and hence that $u(B)$ is
covered by a finite number of balls of radius $r$, as was to be shown.


Remark 1. Let $F$ be a Banach space. It follows from Theorem 1.1 that
if $\{u_n\}$ is a sequence of elements of $L(E, F)$ such that the image of
$u_n$ is finite dimensional for all $n$, and if $\{u_n\}$ converges to an element
$u$ of $L(E, F)$, then $u$ is compact. In general, however, a compact
operator cannot always be expressed as the limit of such a sequence
(Enflo gave a counterexample in 1973). It does hold for compact operators
in Hilbert space.


Remark 2. We gave the definition of compact mappings on spaces
which are not necessarily complete. Note that if $u: E \to F$ is compact,
and if $\bar{E}$, $\bar{F}$ denote the completions of $E$ and $F$ respectively, then
the linear continuous extension

\[ \bar{u}: \bar{E} \to \bar{F} \]

is also compact. This is immediate. Furthermore, if $E_1$ is any subspace
of $E$ and $F_1 \supset u(E_1)$, then the restriction

\[ u|E_1: E_1 \to F_1 \]

is also compact.


Theorem 1.2. Let $E$, $F$, $G$, $H$ be normed vector spaces and let

\[ f: E \to F, \qquad u: F \to G, \qquad g: G \to H \]

be continuous linear maps. If $u$ is compact then $u \circ f$ and $g \circ u$ are
compact. In particular, $K(E, E)$ is a two-sided ideal of $L(E, E)$.


Proof. The first relation follows from the fact that a continuous image
of a compact set is compact. The second is obvious. The third comes
from the definitions.


Theorem 1.3. Let $E$, $F$ be Banach spaces and $u: E \to F$ a compact
linear map. Then $u': F' \to E'$ is compact.


Proof. One can give a direct simple proof, but the reader will note 
that our assertion is an immediate consequence of the Ascoli theorem. 
We shall make no use of Theorem 1.3 in this book, and thus we leave 
the details to the reader. 


XVII, §2. FREDHOLM OPERATORS AND THE INDEX


Let $E$, $F$ be normed vector spaces. A continuous linear map

\[ T: E \to F \]

is said to be Fredholm if:

(i) $\operatorname{Ker} T$ is finite dimensional.
(ii) $\operatorname{Im} T$ is closed and finite codimensional.


Example (The Shift Operator). Let $E$ be the Hilbert space of all
sequences $\alpha = \{a_n\}$ such that $\sum |a_n|^2$ converges. (This is essentially
the space of Fourier series.) We define

\[ T: E \to E \]

by

\[ T\alpha = (a_2, a_3, \ldots) \]

if $\alpha = (a_1, a_2, \ldots)$. The kernel of $T$ is 1-dimensional, and $T$ is
surjective. There are variants of this operator, for instance the operator
such that

\[ (a_1, a_2, \ldots) \mapsto (0, a_1, a_2, \ldots), \]

which has 0 kernel, and whose image has codimension 1.
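
The two shifts are easy to experiment with on sequences of finite support. In the sketch below, which is only an illustration, a tuple stands for a sequence whose remaining terms are 0; this representation and the function names are our assumptions.

```python
def left_shift(a):
    """(a1, a2, a3, ...) -> (a2, a3, ...): drops the first term."""
    return a[1:]


def right_shift(a):
    """(a1, a2, ...) -> (0, a1, a2, ...): inserts a zero in front."""
    return (0,) + a


if __name__ == "__main__":
    a = (1, 2, 3)
    # The left shift undoes the right shift, so the left shift is surjective
    # and the right shift is injective.
    print(left_shift(right_shift(a)) == a)
    # The other composite loses the first term: the image of the right shift
    # misses the direction of (1, 0, 0, ...).
    print(right_shift(left_shift(a)))
    # The sequence (1, 0, 0, ...) spans the kernel of the left shift.
    print(left_shift((1,)))
```

This matches the description above: the left shift has a 1-dimensional kernel and is surjective, while the right shift is injective with image of codimension 1.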

We shall show in Theorem 2.1 that if $u$ is compact, then $I - u$ is
Fredholm. In a later section, we shall give other examples of Fredholm
operators, as integral or differential operators. The reader may look at
these now to see the concrete applications of our algebra to analysis.

We shall use constantly the corollaries of Theorem 1.3, Chapter XV, 
which the reader is advised to review carefully. The results expressed 
in these corollaries, most of which depend on the open mapping theo- 
rem, will be quoted without further specific reference. We note in par- 
ticular that as a consequence of these corollaries, when E, F are Banach 
spaces, then the hypothesis for Fredholm maps T that Im T is closed 
follows from the finite codimensionality, and could thus be omitted from 
the definition of a Fredholm map in this case. 

We shall also use the fact that a finite dimensional subspace of a
Banach space admits a closed complement. This was an exercise using
the Hahn-Banach theorem.


Theorem 2.1. Let $E$ be a Banach space, and $u: E \to E$ a compact
operator. Then $I - u$ is Fredholm.


Proof. The identity $I$ restricted to the kernel of $I - u$ is equal to $u$,
and is consequently compact. Hence this kernel is finite dimensional,
because a locally compact normed vector space is finite dimensional.

Now we show that the image of $I - u$ is closed. Let $T = I - u$. Let $G$
be a closed complement for $\operatorname{Ker} T$, so that

\[ E = \operatorname{Ker}(I - u) \oplus G. \]

We obtain continuous linear maps

\[ T|G: G \to E \quad\text{and}\quad u|G: G \to E, \]

the restrictions of $T$ and $u$ to $G$. Furthermore, the kernel of $T|G$ is
$\{0\}$.




It will suffice to prove that $TG = TE$ is closed, and for this it will
suffice to prove that the inverse map

\[ (T|G)^{-1}: TE \to G \]

is continuous. (Indeed, in that case, $TG$ is complete, so closed.) It even
suffices to prove that $(T|G)^{-1}$ is continuous at 0, by linearity. Suppose
that this is not the case. Then we can find a sequence $\{x_n\}$ in $G$ such
that $Tx_n \to 0$, but $\{x_n\}$ does not converge to 0. Selecting a suitable
subsequence, we can assume without loss of generality that
$|x_n| \ge r > 0$ for all $n$. Then $1/|x_n| \le 1/r$ for all $n$, and consequently
$T(x_n/|x_n|)$ also converges to 0. Furthermore, $x_n/|x_n|$ has norm 1, and
hence some subsequence of

\[ \left\{ u\!\left( \frac{x_n}{|x_n|} \right) \right\} \]

converges. Since

\[ T\!\left( \frac{x_n}{|x_n|} \right) = \frac{x_n}{|x_n|} - u\!\left( \frac{x_n}{|x_n|} \right), \]

it follows that a subsequence of $\{x_n/|x_n|\}$ converges to some element $z$
in $G$, also having norm 1. But then $0 = z - u(z)$, and $Tz = 0$. This
contradicts the fact that $G \cap \operatorname{Ker} T = \{0\}$, and thus proves that
$TE = TG$ is closed.

Finally we have to show that $TE$ has finite codimension. We shall
need the following lemma, which will also be used later in the spectral
theorem.


Lemma 2.2. Given $\varepsilon > 0$. Let $F$ be a closed subspace of a normed
vector space $H$, and assume that $F \ne H$. Then there exists $x \in H$ with
$|x| = 1$ such that

\[ d(x, F) = \inf_{y \in F} |x - y| \ge 1 - \varepsilon. \]


Proof. Let $z \in H$ and $z \notin F$. Select $y_0 \in F$ such that

\[ |z - y_0| \le \Bigl( \inf_{y \in F} |z - y| \Bigr)(1 + \varepsilon). \]

We let

\[ x = \frac{z - y_0}{|z - y_0|}. \]

Then for $y \in F$ we have

\[ |x - y| = \left| \frac{z - y_0}{|z - y_0|} - y \right| = \frac{\bigl| z - y_0 - |z - y_0|\, y \bigr|}{|z - y_0|} \ge \frac{|z - y_0|}{|z - y_0|(1 + \varepsilon)} = \frac{1}{1 + \varepsilon} \ge 1 - \varepsilon, \]

which proves our lemma.


To apply the lemma, suppose that $TE$ does not have finite
codimension. We can find a sequence of closed subspaces

\[ TE = H_0 \subset H_1 \subset \cdots \subset H_n \subset \cdots \]

such that each $H_n$ is closed and of codimension 1 in $H_{n+1}$, just by
adding one-dimensional spaces to $TE$ inductively. By the lemma, we can
find in each $H_n$ an element $x_n$ such that $|x_n| = 1$ and
$|x_n - y| \ge 1 - \varepsilon$ for all $y \in H_{n-1}$. Then for all $k < n$:

\[ |ux_n - ux_k| = |x_n - Tx_n - x_k + Tx_k| \ge 1 - \varepsilon \]

because $-Tx_n - x_k + Tx_k$ lies in $H_{n-1}$. This shows that the sequence
$\{ux_n\}$ cannot have a convergent subsequence, and contradicts the
compactness of $u$, thus proving Theorem 2.1.


We denote by $\operatorname{Fred}(E, F)$ the set of Fredholm operators from $E$
into $F$. If $T \in \operatorname{Fred}(E, F)$, then we define the index of $T$ to be

\[ \operatorname{ind} T = \dim \operatorname{Ker} T - \dim F/TE. \]

In the language of linear algebra, the factor space $F/TE$ is also called the
cokernel of $T$, and thus

\[ \operatorname{ind} T = \dim \operatorname{Ker} T - \dim \operatorname{coker} T. \]
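
In finite dimensions every linear map is Fredholm, and by the rank and nullity relation its index is always $\dim E - \dim F$, whatever the map. The sketch below illustrates this with exact arithmetic; the rank routine and the sample matrices are our own illustrative choices. It computes kernel and cokernel dimensions from the rank and shows that very different maps from a 3-dimensional space to a 2-dimensional space share the same index.

```python
from fractions import Fraction


def rank(rows):
    """Rank of a matrix (list of rows) by Gaussian elimination over Q."""
    m = [[Fraction(entry) for entry in row] for row in rows]
    r = 0
    n_cols = len(m[0]) if m else 0
    for c in range(n_cols):
        pivot = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, len(m)):
            factor = m[i][c] / m[r][c]
            m[i] = [a - factor * b for a, b in zip(m[i], m[r])]
        r += 1
    return r


def index(rows):
    """dim Ker - dim coker for the map x -> rows * x."""
    n_rows, n_cols = len(rows), len(rows[0])
    rk = rank(rows)
    return (n_cols - rk) - (n_rows - rk)


if __name__ == "__main__":
    s = [[1, 0, 0], [0, 1, 0]]  # surjective, kernel of dimension 1
    t = [[1, 0, 0], [0, 0, 0]]  # kernel of dimension 2, cokernel of dimension 1
    zero = [[0, 0, 0], [0, 0, 0]]
    print(index(s), index(t), index(zero))
```

The kernel and cokernel dimensions jump when the map is perturbed, but their difference does not, which is the finite dimensional shadow of the invariance proved next.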


Theorem 2.3. Let E, F be Banach spaces. Then Fred(E, F) is open in
L(E, F), and the function $T \mapsto \operatorname{ind} T$ is continuous on Fred(E, F), hence
constant on connected components.


Proof. Let $S: E \to F$ be a Fredholm operator. We wish to prove that
if $T \in L(E, F)$ is close to S, then T itself is Fredholm. Let N be the
kernel of S, and let G be a closed complement for N, that is $E = N \oplus G$.
Then S induces a toplinear isomorphism of G on its image SG (by the
open mapping theorem), and we can write $F = SG \oplus H$ for some finite
dimensional subspace H. The map

\[ G \times H \to SG \oplus H = F \]

given by

\[ (x, y) \mapsto Sx + y \]

is a toplinear isomorphism. We know that the set of toplinear
isomorphisms of one Banach space into another is open in the space of all
continuous linear maps. If T is close to S, then the map

\[ G \times H \to TG \oplus H = F \]

given by

\[ (x, y) \mapsto Tx + y \]

is therefore also a toplinear isomorphism. Hence the kernel of T is finite
dimensional, since $G \cap \operatorname{Ker} T = \{0\}$, and in fact dim Ker T is at most
equal to the codimension of G in E. The image of T has finite
codimension (at most equal to the dimension of H), and is consequently closed,
by Corollary 1.8 of Chapter XV. This proves that T is Fredholm, and
proves our first assertion.

Now concerning the index, we observe that $G \oplus \operatorname{Ker} T$ is a direct sum
of two closed subspaces, and there is some finite dimensional subspace M
such that

\[ E = G \oplus \operatorname{Ker} T \oplus M. \]

Then T induces a toplinear isomorphism

\[ G \oplus M \to T(G \oplus M) = TG \oplus TM, \]

and

\[ \dim M = \dim TM. \]

Hence we get

\[ \begin{aligned}
\operatorname{ind} T &= \dim \operatorname{Ker} T - (\dim H - \dim TM) \\
 &= \dim \operatorname{Ker} T + \dim M - \dim H \\
 &= \dim \operatorname{Ker} S - \dim H \\
 &= \operatorname{ind} S.
\end{aligned} \]

This proves our theorem.


Corollary 2.4. Let E be a Banach space and u a compact operator on
E. If $I - u$ is injective (i.e. $\operatorname{Ker}(I - u) = \{0\}$), then $I - u$ is a toplinear
automorphism.


Proof. For each real t, the operator tu is compact. The map $t \mapsto tu$ is
continuous, and so is the map

\[ t \mapsto \operatorname{ind}(I - tu). \]

Hence this map is constant. Letting t = 0 and t = 1 shows that

\[ \operatorname{ind}(I - u) = 0. \]

Since $I - u$ is injective and has index 0, its cokernel is {0}. Hence
$I - u$ is surjective, whence a toplinear isomorphism by the open
mapping theorem.


Note. Examples of compact operators u furnish immediately exam- 
ples of Fredholm operators I—u. For other examples, cf. for instance 
Smale’s paper [Sm 3]. One obtains Fredholm linear maps by taking the 
derivative of certain “Fredholm” non-linear maps, which are of interest in 
differential equations. Thus one sees the linearization provided by the 
derivative as a first step in analyzing non-linear problems. 


Let T, S be continuous linear maps $E \to F$. We say that T is congruent
to S modulo compact operators if $T - S$ is compact, and we write

\[ T \equiv S \mod K(E, F). \]

This congruence is an equivalence relation, and if $T \equiv S$, $T_1 \equiv S_1$, then
$TT_1 \equiv SS_1$. This is immediately verified as a consequence of Theorem
1.2. Of course, the composition $TT_1$ (or $SS_1$) must make sense. It means
that we compose $T_1: E_1 \to E$ with T as above, and similarly with $SS_1$.
Similar congruence statements hold for sums.

We say that $T: E \to F$ is invertible modulo compact operators if there
exists a continuous linear map $T_1: F \to E$ such that

\[ TT_1 \equiv I_F \mod K(F, F) \quad\text{and}\quad T_1 T \equiv I_E \mod K(E, E). \]

Thus we call $T_1$ an inverse of T modulo compact operators.


Theorem 2.5. Let E, F be Banach spaces and let $T: E \to F$ be a
continuous linear map. Then T is Fredholm if and only if T is invertible
modulo compact operators K(E, F). We can select an inverse of T
modulo compact operators, having finite codimensional image.


Proof. Let T be Fredholm, and write direct sum decompositions

\[ E = \operatorname{Ker} T \oplus G, \qquad F = \operatorname{Im} T \oplus H \]

with closed subspaces G, H. We let S be the composite

\[ F = \operatorname{Im} T \oplus H \xrightarrow{\ \mathrm{pr}\ } \operatorname{Im} T \xrightarrow{\ T^{-1}\ } G \xrightarrow{\ \mathrm{inc.}\ } E \]

where pr is the projection, inc. is the inclusion, and $T^{-1}$ is the inverse
of the isomorphism $G \to \operatorname{Im} T$ induced by T. Then $I_F - TS$ is the
projection on H, and $I_E - ST$ is the projection on Ker T. Both projections
have finite dimensional image, hence are compact. This proves
that T has an inverse modulo compact operators. Conversely, suppose
that S is such an inverse; then we have

\[ \operatorname{Ker} T \subset \operatorname{Ker} ST, \]

so T has finite dimensional kernel. Also

\[ \operatorname{Im} T \supset \operatorname{Im} TS \]

and TS has a closed image of finite codimension by Theorem 2.1. Hence
Im T is closed of finite codimension, so that T is Fredholm.
This proves our theorem.


Note. As an exercise, prove the usual uniqueness of an inverse, that is,
suppose that there exist continuous linear maps $T_1$, $T_2$ such that

\[ TT_1 \equiv I_F \mod K(F, F) \]

and

\[ T_2 T \equiv I_E \mod K(E, E). \]

Show that $T_1 \equiv T_2 \mod K(F, E)$, and that $T_1$ or $T_2$ is thus an inverse for
T modulo compact operators.


Corollary 2.6. The composite of Fredholm maps is Fredholm. If T is 
Fredholm and u is compact, then T + u is Fredholm. 


Proof. Clear.


Corollary 2.7. If T is Fredholm and u is compact, then

\[ \operatorname{ind}(T + u) = \operatorname{ind} T. \]

Proof. The same proof works as for the corollary of Theorem 2.3,
namely we connect $T + u$ with T by the segment

\[ T + tu, \qquad 0 \le t \le 1. \]


The next theorem will not be used later in a significant way and thus 
its proof can be omitted if the reader is allergic to formal algebra. 


Theorem 2.8. Let E, F, G be Banach spaces, and let

\[ S: E \to F \quad\text{and}\quad T: F \to G \]

be Fredholm. Then

\[ \operatorname{ind} TS = \operatorname{ind} T + \operatorname{ind} S. \]

Proof. To do this proof properly, we need an algebraic lemma. Let V
be a vector space and W a subspace. Let

\[ f: V \to f(V) \]

be a linear map, with image f(V), which we also write fV for simplicity
of notation. If the factor space V/W is finite dimensional, we denote by
(V : W) the dimension of the factor space V/W. We denote by $V_f$ the
kernel of f in V, and by $W_f$ the kernel of f in W, that is $W \cap V_f$.


Lemma 2.9. Let V be a vector space and W a subspace. Let $f: V \to fV$
be a linear map. Then

\[ (V : W) = (fV : fW) + (V_f : W_f) \]

in the sense that if two of these indices are finite, then so is the third,
and the stated relation holds.


Proof. Consider the composite of linear maps

\[ V \to fV \to fV/fW. \]

The kernel certainly contains $V_f + W$. If $x \in V$ lies in the kernel, this
means that there exists some $y \in W$ such that f(x) = f(y), and then
$f(x - y) = 0$, so $x - y$ lies in $V_f$. Hence the kernel is precisely equal to
$V_f + W$. Hence we obtain an isomorphism

(1)  \[ V/(V_f + W) \to fV/fW. \]

We have inclusions of subspaces

(2)  \[ W \subset V_f + W \subset V. \]

We consider the linear map

\[ V_f \to (V_f + W)/W \]

given by

\[ x \mapsto \text{class of } x \text{ modulo } W. \]

An element of the kernel is such that it also lies in W, so that we obtain
an isomorphism

(3)  \[ V_f/W_f \to (V_f + W)/W. \]

From this we see at once that if two of our indices are finite, so is
the third. Indeed, suppose that (V : W) and (fV : fW) are finite. Then
$(V : V_f + W)$ is finite by (1) and hence $(V_f : W_f)$ is finite by (3). The
others are proved similarly. As for the relation concerning the
dimensions, we see that whenever our indices are finite, then

(4)  \[ (V : W) = (V : V_f + W) + (V_f + W : W). \]

If we now use (1) and (3), we get the relation stated in the lemma, as was
to be shown.
We return to the proof of Theorem 2.8.

For simplicity of notation, we use our notation $E_S$ for the kernel of S
in E. We write down the definitions of the index for T, S and TS:

\[ \begin{aligned}
\operatorname{ind} S &= (E_S : 0) - (F : SE), \\
\operatorname{ind} T &= (F_T : 0) - (G : TF), \\
\operatorname{ind} TS &= (E_{TS} : 0) - (G : TSE).
\end{aligned} \]

We have the following inclusions:

\[ \{0\} \subset \operatorname{Ker} S \subset \operatorname{Ker} TS \]

because Sx = 0 implies TSx = 0, and also

\[ TSE \subset TF \subset G. \]

Hence

(5)  \[ (E_{TS} : 0) = (E_{TS} : E_S) + (E_S : 0) \]

and

(6)  \[ (G : TSE) = (G : TF) + (TF : TSE). \]

We apply our lemma to the spaces $SE \subset F$, and to the map T. We then
get

(7)  \[ (F : SE) = (TF : TSE) + (F_T : SE \cap F_T). \]

From the inclusions

\[ \{0\} \subset SE \cap F_T \subset F_T \]

we obtain

(8)  \[ (F_T : 0) = (F_T : SE \cap F_T) + (SE \cap F_T : 0). \]

If we now substitute the values of (5), (6), (7), (8) into the expression for

\[ \operatorname{ind} TS - \operatorname{ind} T - \operatorname{ind} S, \]

we obtain

\[ (E_{TS} : E_S) - (SE \cap F_T : 0), \]

and we have to show that this is equal to 0. Let us write $E_{TS}$ as a direct
sum

\[ E_{TS} = E_S \oplus W \]

for some finite dimensional W. Then we can write E as a direct sum

\[ E = E_{TS} \oplus U = E_S \oplus W \oplus U. \]

Then

\[ SE = S(W \oplus U) = SW \oplus SU. \]

We contend that $SE \cap F_T = SW$. Indeed, it is clear that $SW \subset F_T$, and
conversely, if $y \in SE$, y = Sx and TSx = 0, then $x = x_1 + x_2$ with $x_1 \in E_S$
and $x_2 \in W$, and $Sx = Sx_2 \in SW$, whence $SE \cap F_T = SW$. But then

\[ (E_{TS} : E_S) = (W : 0) = (SW : 0) \]

because S is an isomorphism on W. This concludes the proof of our
theorem.


XVII, §3. SPECTRAL THEOREM FOR 
COMPACT OPERATORS 


Throughout this section, we let E be a Banach space, and let $u: E \to E$ be a
compact operator.


We are interested in the spectrum of u. We recall that a number $\alpha$ is
called an eigenvalue for u if there exists a non-zero vector $x \in E$ such that

\[ ux = \alpha x. \]

In that case we call x an eigenvector for u, belonging to $\alpha$.

If $\alpha$ is a number $\ne 0$, then $\alpha u$ is compact and so is $\alpha^{-1}u$. Hence
$u - \alpha I$ and $\alpha I - u$ are Fredholm, by Theorem 2.1. Furthermore, for
every positive integer n, the operator $(I - u)^n$ can be written as

\[ (I - u)^n = I - u_n \]

for some compact $u_n$, because we expand with the binomial expansion,
and use Theorem 1.2. Hence $(u - \alpha I)^n$ is also Fredholm for $\alpha \ne 0$.
By Corollary 2.4, we know that for $\alpha \ne 0$,

\[ \operatorname{ind}(u - \alpha I)^n = 0, \]

in other words,

\[ \dim \operatorname{Ker}(u - \alpha I)^n = \dim \operatorname{coker}(u - \alpha I)^n. \]


Theorem 3.1. Let $\alpha$ be a number $\ne 0$. Either $u - \alpha I$ is invertible, or $\alpha$
is an eigenvalue of u. In other words, every element of the spectrum of
u is an eigenvalue, except possibly 0. If E is infinite dimensional, then 0
is in the spectrum.


Proof. By Corollary 2.4 we know that if $u - \alpha I$ has kernel {0}, then
$u - \alpha I$ is invertible. Thus our first statement is essentially merely a
reformulation of this corollary. If 0 is not in the spectrum, then u is
invertible, and then the image of the closed unit ball by u is
homeomorphic to this unit ball and is compact, so that E is locally compact, and
hence finite dimensional, thereby proving Theorem 3.1.


In the theory of a finite dimensional vector space V, with an
endomorphism $u: V \to V$, one knows that we can decompose V into a direct sum

\[ V = N_1 \oplus \cdots \oplus N_m \]

such that each $N_i$ corresponds to an eigenvalue $\alpha_i$ of u, and such that for
each i, there exists an integer $r_i$ having the property that

\[ (u - \alpha_i I)^{r_i} N_i = 0. \]

As one says, $u - \alpha_i I$ is nilpotent on $N_i$. When that is the case, a theorem
like the Jordan normal form theorem gives a canonical matrix
representing u with respect to a suitable basis, namely blocks consisting of
triangular matrices of type

\[ \begin{pmatrix}
\alpha_i & 1 & 0 & \cdots & 0 \\
0 & \alpha_i & 1 & \cdots & 0 \\
\vdots & & \ddots & \ddots & \vdots \\
0 & 0 & 0 & \cdots & 1 \\
0 & 0 & 0 & \cdots & \alpha_i
\end{pmatrix} \]

Thus the decomposition of V into subspaces as above yields complete
information concerning u. We shall now do this for compact operators.
Of course, we get infinitely many subspaces in the decomposition,
corresponding to possibly infinitely many eigenvalues.


Lemma 3.2. Let $\alpha$ be a non-zero eigenvalue for u. Then there exists an
integer r > 0 such that

\[ \operatorname{Ker}(u - \alpha I)^r = \operatorname{Ker}(u - \alpha I)^n \]

for all $n \ge r$.

Proof. It suffices to prove that $\operatorname{Ker}(I - u)^r = \operatorname{Ker}(I - u)^n$ for all
$n \ge$ some r. Suppose that this is not the case. Then we have a strictly
ascending chain of subspaces

\[ \operatorname{Ker}(I - u) \subsetneq \operatorname{Ker}(I - u)^2 \subsetneq \cdots \subsetneq \operatorname{Ker}(I - u)^n \subsetneq \cdots. \]

By Lemma 2.2, we can find an element

\[ x_n \in \operatorname{Ker}(I - u)^n \]

such that $|x_n| = 1$ and $x_n$ is at distance $\ge 1 - \varepsilon$ from $\operatorname{Ker}(I - u)^{n-1}$. Let
$T = I - u$. Then just as in the proof of Theorem 2.1, we find for k < n:

\[ |ux_n - ux_k| = |x_n - Tx_n - x_k + Tx_k| \ge 1 - \varepsilon \]

because $Tx_n$, $x_k$ and $Tx_k$ all lie in $\operatorname{Ker}(I - u)^{n-1}$. This contradicts the
compactness of u and proves the lemma.

It is clear that if $\operatorname{Ker}(u - \alpha I)^r = \operatorname{Ker}(u - \alpha I)^{r+1}$ for some r, then

\[ \operatorname{Ker}(u - \alpha I)^r = \operatorname{Ker}(u - \alpha I)^n \]

for all $n \ge r$. We call the smallest integer r for which this is true the
exponent of $\alpha$.
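
The stabilization of the kernels can be observed directly in the finite dimensional case. The sketch below is only an illustration under that assumption: the matrix `A` and the helper names `rank`, `mat_mul`, `kernel_dims` and `exponent` are hypothetical, and the computation uses exact rational arithmetic. The matrix has a Jordan block of size 2 and a block of size 1 for the eigenvalue 2, together with the simple eigenvalue 5.

```python
from fractions import Fraction


def rank(matrix):
    """Rank by Gaussian elimination over the rationals."""
    rows = [[Fraction(x) for x in row] for row in matrix]
    r = 0
    for c in range(len(rows[0])):
        pivot = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][c] / rows[r][c]
            rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return r


def mat_mul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def kernel_dims(a, alpha, max_power):
    """List of dim Ker (a - alpha I)^k for k = 1, ..., max_power."""
    n = len(a)
    shifted = [[a[i][j] - (alpha if i == j else 0) for j in range(n)]
               for i in range(n)]
    power = [[1 if i == j else 0 for j in range(n)] for i in range(n)]
    dims = []
    for _ in range(max_power):
        power = mat_mul(power, shifted)
        dims.append(n - rank(power))
    return dims


def exponent(a, alpha):
    """Smallest r >= 1 with Ker (a - alpha I)^r = Ker (a - alpha I)^(r+1)."""
    n = len(a)
    dims = kernel_dims(a, alpha, n + 1)
    for r in range(1, n + 1):
        if dims[r - 1] == dims[r]:
            return r
    return None  # not reached: the chain stabilizes by step n in dimension n


A = [[2, 1, 0, 0],
     [0, 2, 0, 0],
     [0, 0, 2, 0],
     [0, 0, 0, 5]]
```

Here `kernel_dims(A, 2, 4)` grows from 2 to 3 and then stays constant, so the exponent of the eigenvalue 2 is 2, while the eigenvalue 5 has exponent 1.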


Theorem 3.3. Let $\alpha$ be a non-zero eigenvalue of u, and let r be its
exponent. Then we have a direct sum decomposition

\[ E = \operatorname{Ker}(u - \alpha I)^r \oplus \operatorname{Im}(u - \alpha I)^r, \]

and each of the spaces occurring in this direct sum is a closed invariant
subspace of u. If $\beta \ne \alpha$ is another non-zero eigenvalue of u, and s is its
exponent, then

\[ \operatorname{Ker}(u - \beta I)^s \subset \operatorname{Im}(u - \alpha I)^r. \]


Proof. Let $T = (u - \alpha I)^r$. Then T is Fredholm. Both Ker T and Im T
are u-invariant closed subspaces. Furthermore, we have

\[ \operatorname{Ker} T \cap \operatorname{Im} T = \{0\}. \]

Indeed, suppose that $x \in \operatorname{Ker} T \cap \operatorname{Im} T$. We can write x = Ty for some
$y \in E$. Since Tx = 0 we get $T^2 y = (u - \alpha I)^{2r} y = 0$. Since

\[ \operatorname{Ker} T = \operatorname{Ker} T^2 \]

by the lemma, we conclude that $y \in \operatorname{Ker} T$, and therefore x = 0. Finally,
since the index of T is equal to 0, it follows that codim Im T = dim Ker T.
Since we have already a direct sum decomposition $\operatorname{Ker} T \oplus \operatorname{Im} T$ for
some subspace of E, it follows that this subspace must be all of E. This
proves our first assertion. Let now $\beta \ne \alpha$ be another eigenvalue for u,
and let $S = (u - \beta I)^s$. Then ST = TS, so Ker T and Im T are S-invariant
subspaces. Let $x \in \operatorname{Ker} S$. We can write

\[ x = y + z \]

uniquely with $y \in \operatorname{Ker} T$ and $z \in \operatorname{Im} T$. Then

\[ 0 = Sx = Sy + Sz, \]

and since $Sy \in \operatorname{Ker} T$, $Sz \in \operatorname{Im} T$, it follows from the uniqueness of the
decomposition that Sy = 0. But S, T are obtained as relatively prime
polynomials in u, and hence there exist polynomials in u, namely P and
Q, such that

\[ PS + QT = I. \]

(We recall the proof below.) Applying this to y shows that Iy = 0 so that
y = 0 and hence $x = z \in \operatorname{Im} T$, thus proving our theorem.


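Now to recall the proof of the existence of P, Q, let $A = u - \alpha I$ and
$B = u - \beta I$. There exist constants a, b such that

\[ aA + bB = I. \]

We take n sufficiently large, and raise both sides to the n-th power. We
obtain

\[ \sum_{j=0}^{n} c_j (aA)^j (bB)^{n-j} = I, \]

where the $c_j$ are the binomial coefficients. If we take n sufficiently large,
then for each j we have $j \ge r$ or $n - j \ge s$, so each term contains $A^r = T$ or
$B^s = S$ as a factor, and thus the
existence of P, Q follows as desired.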
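
The binomial argument can be carried out explicitly with polynomials in one variable x standing for u. The following sketch is an illustration only: the function names are hypothetical, polynomials are represented as coefficient lists in increasing degree, and it uses the choices a = 1/(β - α), b = -a and n = r + s - 1, which are one way (assumed here) of making "n sufficiently large" concrete.

```python
from fractions import Fraction
from math import comb


def poly_mul(p, q):
    """Product of two polynomials given as coefficient lists (lowest degree first)."""
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out


def poly_add(p, q):
    """Sum of two polynomials, padding the shorter one with zeros."""
    n = max(len(p), len(q))
    p = p + [Fraction(0)] * (n - len(p))
    q = q + [Fraction(0)] * (n - len(q))
    return [a + b for a, b in zip(p, q)]


def poly_pow(p, k):
    """k-th power of a polynomial, k >= 0."""
    out = [Fraction(1)]
    for _ in range(k):
        out = poly_mul(out, p)
    return out


def trim(p):
    """Drop trailing zero coefficients, keeping at least the constant term."""
    while len(p) > 1 and p[-1] == 0:
        p = p[:-1]
    return p


def bezout(alpha, beta, r, s):
    """Return P, Q with P*(x - beta)^s + Q*(x - alpha)^r = 1, for alpha != beta."""
    alpha, beta = Fraction(alpha), Fraction(beta)
    a = 1 / (beta - alpha)          # then a*(x - alpha) + b*(x - beta) = 1
    b = -a
    A = [-alpha, Fraction(1)]       # the polynomial x - alpha
    B = [-beta, Fraction(1)]        # the polynomial x - beta
    n = r + s - 1                   # guarantees j >= r or n - j >= s
    P, Q = [Fraction(0)], [Fraction(0)]
    for j in range(n + 1):
        coeff = comb(n, j) * a ** j * b ** (n - j)
        if j >= r:                  # term contains A^r; goes into Q
            term = poly_mul(poly_pow(A, j - r), poly_pow(B, n - j))
            Q = poly_add(Q, [coeff * c for c in term])
        else:                       # here n - j >= s, term contains B^s; goes into P
            term = poly_mul(poly_pow(A, j), poly_pow(B, n - j - s))
            P = poly_add(P, [coeff * c for c in term])
    return trim(P), trim(Q)


def check(alpha, beta, r, s):
    """Evaluate P*S + Q*T as a polynomial; the result should be the constant 1."""
    P, Q = bezout(alpha, beta, r, s)
    S = poly_pow([Fraction(-beta), Fraction(1)], s)
    T = poly_pow([Fraction(-alpha), Fraction(1)], r)
    return trim(poly_add(poly_mul(P, S), poly_mul(Q, T)))
```

Substituting u for x in the polynomials returned by `bezout` gives operators P, Q of the kind used in the proof of Theorem 3.3.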


Theorem 3.4. Assume that there are infinitely many eigenvalues. Then
the eigenvalues $\ne 0$ of u form a denumerable set, and if we order them
as $\alpha_1, \alpha_2, \ldots$ such that

\[ |\alpha_i| \ge |\alpha_{i+1}|, \]

then

\[ \lim_{i \to \infty} \alpha_i = 0. \]

Proof. Given c > 0 we first show that there is only a finite number of
eigenvalues $\alpha$ such that $|\alpha| \ge c$. If this is not true, then we can find a
sequence of eigenvectors $\{w_n\}$ belonging to distinct eigenvalues $\{\alpha_n\}$ such
that $|w_n| = 1$ and $|\alpha_n| \ge c > 0$ for all n. The vectors $w_1, \ldots, w_n$ are
linearly independent, for otherwise, if $n \ge 2$ and

\[ c_1 w_1 + \cdots + c_n w_n = 0, \]

then we apply u to this relation, and get

\[ c_1 \alpha_1 w_1 + \cdots + c_n \alpha_n w_n = 0. \]

We divide by $\alpha_1$ and subtract, obtaining

\[ c_2(1 - \alpha_2/\alpha_1) w_2 + \cdots + c_n(1 - \alpha_n/\alpha_1) w_n = 0. \]

By induction, we could assume $w_2, \ldots, w_n$ linearly independent, and hence
$c_2 = \cdots = c_n = 0$, whence $c_1 = 0$, as was to be shown. We let $H_n$ be the
space generated by $w_1, \ldots, w_n$. By Lemma 2.2 we can find $x_n \in H_n$ such
that $|x_n| = 1$ and

\[ |x_n - y| \ge 1 - \varepsilon \]

for all $y \in H_{n-1}$. Then for k < n we get for some $y \in H_{n-1}$:

\[ |ux_n - ux_k| = |\alpha_n x_n - y| \ge c(1 - \varepsilon). \]

This contradicts the compactness of u, and proves that the number of
eigenvalues $\alpha$ such that $|\alpha| \ge c$ is finite.

Thus we can order the eigenvalues in a sequence $\{\alpha_1, \alpha_2, \ldots\}$ such that

\[ |\alpha_i| \ge |\alpha_{i+1}| \]

and we get $\lim_{i \to \infty} \alpha_i = 0$. This proves our theorem.


Let $\{\alpha_i\}$ be the sequence of eigenvalues of u, ordered such that
$|\alpha_i| \ge |\alpha_{i+1}|$. Let $r_i$ be the exponent of $\alpha_i$. We can form the subspaces

\[ F_n = \sum_{i=1}^{n} \operatorname{Ker}(u - \alpha_i I)^{r_i}, \]

and the sum is direct since each $\operatorname{Ker}(u - \alpha_i I)^{r_i}$ is finite dimensional. Then
we get an ascending sequence of subspaces

\[ F_1 \subset F_2 \subset \cdots \subset F_n \subset \cdots. \]

Similarly, each one of these subspaces has a complementary closed
subspace $H_n$ which can be described in various ways. For instance, we can
proceed by induction, assuming that we have already found such a closed
u-invariant subspace $H_n$. Then $u|H_n$ is a compact operator, whose
eigenvalues are precisely $\alpha_j$ for j > n, and we can decompose $H_n$ as in Theorem
3.3. Assuming inductively that

\[ H_n = \operatorname{Im} \prod_{i=1}^{n} (u - \alpha_i I)^{r_i}, \]

we conclude that the same relation holds when n is replaced by n + 1.
We get a decreasing sequence

\[ H_1 \supset H_2 \supset \cdots \supset H_n \supset \cdots \]

and a direct sum decomposition

\[ E = F_n \oplus H_n. \]

This decomposition of E as a direct sum is what we call the spectral
theorem for compact operators. Our method of proof is due to Riesz. Cf.
[Di] and [R-N].

We conclude this section with some remarks in case we are given a
compact operator on a normed vector space which is not complete. This
is often the case, when we are given an operator say on $C^\infty$ functions,
and we extend it to the completion of the space of $C^\infty$ functions with
respect to some norm.


Theorem 3.5. Let E be a normed vector space and $u: E \to E$ a compact
operator. Let $\bar{E}$ be the completion of E, and let

\[ \bar{u}: \bar{E} \to \bar{E} \]

be the continuous linear extension of u. Then $\bar{u}$ maps $\bar{E}$ into E itself. If
$\alpha \ne 0$ is an eigenvalue of $\bar{u}$ of exponent r, and $N_\alpha = \operatorname{Ker}(\bar{u} - \alpha I)^r$, then
$N_\alpha$ is contained in E.


Proof. Let $x \in \bar{E}$ and let $\{x_n\}$ be a sequence in E approaching x.
Then $\{ux_n\}$ has a subsequence which converges in E, and hence $\bar{u}x$ lies
in E, thus proving our first assertion. As to the second, let x be an
eigenvector of $\bar{u}$ in $\bar{E}$ belonging to the eigenvalue $\alpha \ne 0$. Then

\[ x = \alpha^{-1}\bar{u}(x), \]

whence x lies in E. Inductively, suppose that the kernel of $(\bar{u} - \alpha I)^k$
is contained in E, and suppose that $(\bar{u} - \alpha I)^{k+1} x = 0$ for some $x \in \bar{E}$.
Then $(\bar{u} - \alpha I)x = y$ lies in the kernel of $(\bar{u} - \alpha I)^k$ and hence lies in E.
Therefore

\[ x = \alpha^{-1}(\bar{u}x - y) \]

lies in E, thus proving our theorem.


XVII, §4. APPLICATION TO INTEGRAL EQUATIONS 


We consider a continuous function K(x, y) of two variables ranging over
a square [a, b] x [a, b]. Then we obtain an operator $S_K$ such that

(1)  \[ S_K f(x) = \int_a^b K(x, y) f(y)\, dy. \]

We shall consider this operator with respect to two norms on the space
E of continuous functions on [a, b].


Case 1. We take E with the sup norm, so that E is a Banach space.
Then $S_K$ is compact.


Indeed, this follows trivially from Ascoli's theorem, because K is
uniformly continuous, and if $\Phi$ is a subset of E, bounded by C > 0, then
estimating the integral in the usual way shows that $S_K(\Phi)$ is bounded,
and for $f \in \Phi$ we have

(2)  \[ |S_K f(x) - S_K f(x_0)| \le \int_a^b |K(x, y) - K(x_0, y)|\,|f(y)|\, dy \le C(b - a)\varepsilon \]

whenever $|x - x_0| < \delta$. Hence $S_K(\Phi)$ is equicontinuous, and our assertion
is proved.


Thus we can apply the spectral theorem for compact operators. 
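
A numerical sketch may help to make the operator $S_K$ concrete. It rests on assumptions that are not in the text: the kernel K(x, y) = min(x, y) on [0, 1] is chosen purely for illustration, the integral in (1) is replaced by a midpoint rule on n nodes, and the largest eigenvalue of the resulting matrix is found by power iteration. For this particular kernel a standard computation gives eigenvalues $1/((k - 1/2)^2\pi^2)$, k = 1, 2, ..., so the largest one is $4/\pi^2$ and the sequence tends to 0, as Theorem 3.4 predicts. The function name `top_eigenvalue` is hypothetical.

```python
import math


def top_eigenvalue(kernel, a, b, n=100, iterations=40):
    """Approximate the largest eigenvalue of f -> int_a^b kernel(x, y) f(y) dy.

    The integral is discretized by the midpoint rule with n nodes, and the
    dominant eigenvalue of the resulting n x n matrix is estimated by power
    iteration. This assumes the dominant eigenvalue is positive and simple.
    """
    h = (b - a) / n
    nodes = [a + (i + 0.5) * h for i in range(n)]
    matrix = [[kernel(x, y) * h for y in nodes] for x in nodes]
    v = [1.0 / math.sqrt(n)] * n          # unit starting vector
    lam = 0.0
    for _ in range(iterations):
        w = [sum(m * c for m, c in zip(row, v)) for row in matrix]
        lam = math.sqrt(sum(c * c for c in w))   # |Mv| for the unit vector v
        v = [c / lam for c in w]
    return lam


estimate = top_eigenvalue(lambda x, y: min(x, y), 0.0, 1.0)
exact = 4 / math.pi ** 2
```

The discretized operator has finite rank, so this only approximates $S_K$; the agreement between `estimate` and `exact` improves as n grows.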


Case 2. We take E with the $L^2$-norm, arising from the hermitian
product

\[ \langle f, g \rangle = \int_a^b f(t)\overline{g(t)}\, dt. \]

Then again $S_K$ is compact, even as a linear map of E into itself (even
though E is not yet complete with respect to the $L^2$-norm!).

Proof. Similar to the proof in the preceding case, except that we
estimate the integral by the Schwarz inequality, namely

\[ |\langle f, g \rangle| \le \|f\|_2 \|g\|_2. \]

If M is a bound for K on the square, and $K_x$ is the function such that
$K_x(y) = K(x, y)$, then $\|K_x\|_2 \le M(b - a)^{1/2}$, and hence by Schwarz, from
(1) we even get a bound for the sup norm:

\[ \|S_K f\| \le M(b - a)^{1/2} \|f\|_2, \]

so that if f lies in an $L^2$-bounded set, then $S_K f$ lies in a $C^0$-bounded set.
Similarly, estimating (2) with the Schwarz inequality shows that if $\Phi$ is an
$L^2$-bounded set, then $S_K(\Phi)$ is equicontinuous, so that we can apply
Ascoli's theorem to $S_K(\Phi)$. Note that $S_K(\Phi)$ in this case is relatively
compact with respect to the sup norm, let alone the $L^2$-norm, even
though we started with a set $\Phi$ which was only $L^2$-bounded.


The spectral theorem applies therefore in the present case, and so does
Theorem 3.5, which showed that the finite dimensional spaces
corresponding to the eigenvalues $\ne 0$ actually had bases with elements in E
rather than in the $L^2$-completion of E.

If we take K to be hermitian, for instance real valued and such that 
K(x, y) = K(y, x), then we have a Fourier expansion of any function in E 
as an L?-convergent series. One can then start playing the same game as 
in the ordinary theory of Fourier series, and ask for uniform or pointwise 
convergence. We leave it to readers who have a more direct interest in
integral equations to look this up.

Several more examples of compact integral operators will be given in 
the exercises.

1. Let E be the space of $C^\infty$ functions of one variable, periodic of period $2\pi$.
Let D be the derivative. Denote by $E_0$ the space E together with the norm
arising from the hermitian product

\[ \langle f, g \rangle_0 = \int f \bar{g}, \]

and for each positive integer p denote by $E_p$ the same space but with the
product

\[ \langle f, g \rangle_p = \langle f, g \rangle_0 + \langle Df, Dg \rangle_0 + \cdots + \langle D^p f, D^p g \rangle_0. \]

(a) If $f \in E$ and $c_k$ is the Fourier coefficient of f with respect to the function
$e^{ikx}$ (k integer), show that $c_k$ goes to 0 like $1/k^p$, for $k \to \infty$. Use
integration by parts.

(b) Show that $T = I - D^2$ is a toplinear isomorphism

\[ T: E_2 \to E_0 \]

by constructing an inverse S, using term by term integration of the
Fourier series.

(c) Show that this inverse is compact.


2. (a) Let E be the normed vector space of continuous functions on [0, 1] with
the sup norm. Let $S: E \to E$ be the linear map such that

\[ Sf(x) = \int_0^x f(t)\, dt. \]

Show that S is continuous, and that $|S^n|^{1/n} \to 0$ as $n \to \infty$. [Hint: Show
that $|(S^n f)(x)| \le \|f\| x^n/n!$ by induction. You will need some inequality
like $n! \ge n^n e^{-n}$.]

(b) Show that 0 is the only element in the spectrum of S, and that S is
compact.

(c) For each $\alpha \ne 0$, given a continuous function $g \in E$, show that there exists
a continuous function $f \in E$ such that

\[ Sf - \alpha f = g. \]

Express f explicitly as an integral involving g and the exponential
function.


3. Let J = [0, 1] and let E be the vector space of all $C^1$ paths $\alpha: J \to \mathbf{R}^n$. Let
$\|\ \|$ be the sup norm and $|\ |$ the euclidean norm on $\mathbf{R}^n$. Given two paths $\alpha$, $\beta$
define

\[ \langle \alpha, \beta \rangle_0 = \int_0^1 \langle \alpha(t), \beta(t) \rangle\, dt, \]

the product $\langle \alpha(t), \beta(t) \rangle$ being the dot product. Its associated norm is called
the $H^0$-norm on E, and will be denoted by $\|\ \|_0$. Define

\[ \langle \alpha, \beta \rangle_1 = \langle \alpha(0), \beta(0) \rangle + \langle D\alpha, D\beta \rangle_0. \]

(a) Show that this is a positive definite scalar product, and that its norm,
which we call the $H^1$-norm and denote by $\|\ \|_1$, is equivalent to the norm
arising from the scalar product

\[ (\alpha, \beta) \mapsto \langle \alpha, \beta \rangle_0 + \langle D\alpha, D\beta \rangle_0. \]

(b) Show that for $\alpha \in E$ and $s, t \in J$, we have

\[ |\alpha(s) - \alpha(t)| \le |t - s|^{1/2} \|D\alpha\|_0 \]

and

\[ \|\alpha\| \le 2\|\alpha\|_1. \]

(c) Let $H^0(J, \mathbf{R}^n)$ be the completion of E with respect to the $H^0$-norm, and let
$H^1(J, \mathbf{R}^n)$ be its completion with respect to the $H^1$-norm. Show that the
identity mapping on E induces injective continuous linear maps

\[ H^1(J, \mathbf{R}^n) \to C^0(J, \mathbf{R}^n) \quad\text{and}\quad H^1(J, \mathbf{R}^n) \to H^0(J, \mathbf{R}^n). \]

Show that both these maps are compact.


4. If you have not already done them, do Exercise 18 of Chapter IV and
Exercise 7 of Chapter V.


5. Let U be a bounded open set in a Banach space E and let F be a finite
dimensional space. Let $p \ge 1$. Let $BC^p(U, F)$ be the space of $C^p$ maps
$f: U \to F$ such that $D^k f$ is bounded for k = 1, ..., p. Show that the
identity map of $BC^{p+1}(U, F)$ into $BC^p(U, F)$ is a compact operator. [Hint: Use
Ascoli's theorem and the mean value theorem.]


6. Let H, E be Hilbert spaces, and let $A: H \to E$ be an operator. Show that A is
compact if and only if A maps weakly convergent sequences into strongly
convergent sequences. (A sequence is said to converge weakly if the sequence
obtained by applying any functional converges. It converges strongly if it
converges in the usual norm of the Hilbert space.) Show also that if A is
compact and $v_n \to 0$ weakly in H, then $Av_n \to 0$ strongly in E. [Hint: You
may use the principle of uniform boundedness, and also the fact that the unit
ball is closed in the weak topology, see Exercise 10 of Chapter IV. Cf.
$SL_2(\mathbf{R})$, Appendix 4, for details of proof.]


7. Let H, E be Hilbert spaces and let $A: H \to E$ be a compact linear map. Let
$\{e_i\}$ (i = 1, 2, ...) be an orthonormal basis in H. Let H(N) be the closed
subspace generated by the $e_i$ with $i \ge N$. Show that given $\varepsilon$, there exists N
such that for all $v \in H(N)$, we have

\[ |Av|_E \le \varepsilon |v|_H. \]

[Hint: Use the preceding exercise. If the conclusion is false, pick a sequence
$v_n \in H(n)$ of unit vectors such that $|Av_n| > \varepsilon$.]


8. Let $H_s$ be the Hilbert space defined in Exercise 6 of Chapter VII. Following
Exercise 8 of that chapter, if r < s, prove that the inclusion $H_s \to H_r$ is a
compact linear map.


The next three exercises give examples of compact operators which are called
Hilbert-Schmidt. For a systematic treatment of such operators in general Hilbert
spaces, see Exercises 17 and 18 of Chapter XVIII.


9. Hilbert-Schmidt operators in $L^2$. Let $(X, \mathcal{M}, dx)$ and $(Y, \mathcal{N}, dy)$ be measured
spaces. Assume that $L^2(X)$ and $L^2(Y)$ have countable Hilbert bases.

(a) Show that if $\{\varphi_i\}$ and $\{\psi_j\}$ are Hilbert bases for $L^2(X)$ and $L^2(Y)$,
respectively, then $\{\varphi_i \otimes \psi_j\}$ is a Hilbert basis for $L^2(dx \otimes dy)$.

(b) Let $K \in L^2(dx \otimes dy)$. Prove that the operator

\[ f \mapsto S_K f \quad\text{from}\quad L^2(Y) \to L^2(X) \]

given by

\[ S_K f(x) = \int_Y K(x, y) f(y)\, dy \]

is compact. [Hint: Prove first that it is bounded, with bound $\|K\|_2$.
Using partial sums for the Fourier expansion of K, show that $S_K$ can be
approximated by operators with finite dimensional images. Cf. Theorem 2
of $SL_2(\mathbf{R})$, Chapter I, §3.]

Let H be a Hilbert space with countable Hilbert basis. An operator A
on H is said to be Hilbert-Schmidt if there exists a Hilbert basis $\{\varphi_i\}$
such that

\[ \sum_i |A\varphi_i|^2 < \infty. \]

(c) If Y = X, show that $S_K$ is a Hilbert-Schmidt operator.

(d) Conversely, let $T: L^2(X) \to L^2(X)$ be a Hilbert-Schmidt operator. Show
that there exists $K \in L^2(X \times X)$ such that $T = S_K$.
[Hint: Let $T\varphi_i = \sum_j t_{ij}\varphi_j$. Show that $\sum |t_{ij}|^2 < \infty$, and let

\[ K = \sum_{i,j} t_{ij}(\varphi_j \otimes \bar{\varphi}_i). \]

For a more subtle result along these lines, cf. $SL_2(\mathbf{R})$, Theorem 6 of
Chapter XII, §3.]


10. Assume that $(X, \mathcal{M}, dx) = (Y, \mathcal{N}, dy)$ in the preceding exercise, and that X has
finite measure. Let

\[ K = \sum_{m,n} c_{m,n}\, \varphi_m \otimes \bar{\varphi}_n \]

be the Fourier series for K. Let $P_{m,n}$ be the integral operator defined by the
function $\varphi_m \otimes \bar{\varphi}_n$.

(a) Show that $P_{m,n}\varphi_n = \varphi_m$ and $P_{m,n}\varphi_h = 0$ if $h \ne n$.

(b) Assume that the coefficients $c_{m,n}$ tend to 0 sufficiently rapidly. Show that
K is in $L^1$ and that

\[ \int_X K(x, x)\, dx = \sum_n c_{n,n}. \]

(c) Again assume that the coefficients $c_{m,n}$ tend to 0 sufficiently rapidly. Show
that

\[ \sum_n \langle S_K \varphi_n, \varphi_n \rangle = \sum_n c_{n,n}. \]

Under suitable convergence conditions, this gives an integral expression
for the "trace" of $S_K$.

(d) If the Fourier coefficients of K tend to 0 sufficiently rapidly, show that
$S_K$ is the product of two Hilbert-Schmidt operators. By definition, this
means $S_K$ is of "trace class". See Exercise 23 of Chapter XVIII. [Hint:
Look at the technique of the next exercise.]


11. Let T again be the circle, viewing functions on T as functions on the reals
which are periodic of period 1. Let K be a $C^\infty$ function on T x T.

(a) Show that the integral operator $S_K$ given by

\[ S_K f(x) = \int_T f(y) K(x, y)\, dy \]

is the product of two Hilbert-Schmidt operators.

(b) Show that with the notation of Exercise 10, we have

\[ \int_0^1 K(x, x)\, dx = \sum_n c_{n,n}. \]

[Hint: Let $\{\varphi_n\}$ be the Hilbert basis given by $\varphi_n(x) = e^{2\pi i n x}$. Let

\[ B = \sum_{m,n} c_{m,n}(1 + n^2) P_{m,n} \quad\text{and}\quad C = \sum_j \frac{1}{1 + j^2} P_{j,j}. \]

Show that $BC = S_K$.]


12. Let T be the circle as before, and for $f \in L^2(T)$ define

\[ Cf(x) = \int_0^1 e^{2\pi i (x - t)} f(t)\, dt. \]

(a) Prove that C is a compact operator on $L^2(T)$.
(b) Prove that C* = C (so C is self-adjoint).
(c) Describe the spectrum of C.


13. Prove the following theorem (worked out in $SL_2(\mathbf{R})$, Chapter XII, §3).


Theorem. Let X be a locally compact space with a finite positive measure $\mu$.
Let H be a closed subspace of $L^2(X, \mu) = L^2(X)$, and let T be a linear map of
H into the vector space BC(X) of bounded continuous functions on X. Assume
that there exists C > 0 such that

\[ \|Tf\| \le C\|f\|_2 \quad\text{for all } f \in H, \]

where $\|\ \|$ is the sup norm. Then

\[ T: H \to L^2(X) \]

is a compact operator, which can be represented by a kernel in $L^2(X \times X)$, as
in Exercise 9.


CHAPTER XVIII 


Spectral Theorem for 
Bounded Hermitian Operators 


This chapter may be viewed as a direct continuation of the linear algebra 
in the context of Hilbert space first discussed in Chapter V. 


XVIII, §1. HERMITIAN AND UNITARY OPERATORS 


Let E be a Hilbert space. We recall that an operator $A: E \to E$ is a
continuous (or bounded) linear map. We defined the adjoint A* in Chapter
V, §2. An operator A such that A = A* is called hermitian, or
self-adjoint. If E is a real Hilbert space, then instead of hermitian we also say
that A is symmetric. If A is invertible, then one sees at once that

\[ (A^{-1})^* = (A^*)^{-1}. \]

The case when A = A* is the main one studied in this chapter. For a
complex Hilbert space, the following properties are equivalent, concerning
an operator A:


We have A = A*.

The form $\varphi_A: (x, y) \mapsto \langle Ax, y \rangle$ is hermitian.

The numbers $\langle Ax, x \rangle$ are real for all $x \in E$.


The equivalence between the first two is left to the reader. As to the
third, suppose that A = A*. Then

\[ \langle Ax, x \rangle = \langle x, A^*x \rangle = \langle x, Ax \rangle = \overline{\langle Ax, x \rangle}, \]

so $\langle Ax, x \rangle$ is real. Conversely, assume that this is the case. Then

\[ \langle Ax, x \rangle = \langle x, Ax \rangle = \langle A^*x, x \rangle \]

for all x, whence $\langle (A - A^*)x, x \rangle = 0$ for all x, and A = A* by
polarization (Theorem 2.4 of Chapter V).

Let E be a Hilbert space, and A an invertible operator. Then the 
following conditions are equivalent: 

UN 1. $A^* = A^{-1}$.

UN 2. |Ax| = |x| for all $x \in E$.

UN 3. $\langle Ax, Ay \rangle = \langle x, y \rangle$ for all $x, y \in E$.

UN 4. |Ax| = 1 for every unit vector $x \in E$.

An invertible operator satisfying these conditions is said to be
hilbertian, or unitary. The equivalence between the four conditions is very
simple to establish and will be left to the reader (Exercise 3). The set of
unitary (or hilbertian) operators is a group, denoted by Hilb(E).

In §7 you will see how to decompose an arbitrary operator into
a product of hermitian and unitary operators. Thus hermitian and
unitary operators are the basic ones. We shall now study especially the
hermitian ones.


XVIII, §2. POSITIVE HERMITIAN OPERATORS


We wish to see how much information on the norm of A can be derived
from knowing the values of the quadratic form $\langle Ax, x \rangle$.


Lemma 2.1. Let A be an operator, and c a number such that

\[ |\langle Ax, x \rangle| \le c|x|^2 \]

for all $x \in E$. Then for all x, y we have

\[ |\langle Ax, y \rangle| + |\langle x, Ay \rangle| \le 2c|x||y|. \]

Proof. By the polarization identity,

\[ 2|\langle Ax, y \rangle + \langle Ay, x \rangle| \le c|x + y|^2 + c|x - y|^2 = 2c(|x|^2 + |y|^2). \]

Hence

\[ |\langle Ax, y \rangle + \langle Ay, x \rangle| \le c(|x|^2 + |y|^2). \]

[Figure 18.1]

We multiply y by $e^{i\theta}$ and thus get on the left-hand side

\[ |e^{-i\theta}\langle Ax, y \rangle + e^{i\theta}\langle Ay, x \rangle|. \]

The right-hand side remains unchanged, and for suitable $\theta$, the left-hand
side becomes

\[ |\langle Ax, y \rangle| + |\langle Ay, x \rangle|. \]

(In other words, we are lining up two complex numbers by rotating one
by $\theta$ and the other by $-\theta$.) Next we replace x by tx and y by y/t for t
real and t > 0. Then the left-hand side remains unchanged, while the
right-hand side becomes cg(t), where

\[ g(t) = t^2|x|^2 + \frac{1}{t^2}|y|^2. \]

The point at which g'(t) = 0 is the unique minimum, and at this point $t_0$
we find that

\[ g(t_0) = 2|x||y|. \]

This proves our lemma.

Theorem 2.2. Let A be a hermitian operator. Then |A| is the greatest 
lower bound of all values c such that 


|< Ax, x>| S e|x|? 


for all x, or equivalently, the sup of all values |{Ax, x>| taken for x on 
the unit sphere in E. 


Proof. When A is hermitian we obtain 


|< Ax, y>| S clxl ly! 
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for all x, ye E, so that we get |A| <c in the lemma. On the other hand, 
c =|A| is certainly a possible value for c by the Schwarz inequality. This 
proves our theorem. 


Theorem 2.2 allows us to define an ordering in the space of hermitian
operators. If A is hermitian, we define $A \ge O$ and say that A is positive
if $\langle Ax, x\rangle \ge 0$ for all $x \in E$. If A, B are hermitian we define $A \ge B$ if
$A - B \ge O$. This is indeed an ordering, and the usual rules hold: if
$A_1 \ge B_1$ and $A_2 \ge B_2$, then
\[ A_1 + A_2 \ge B_1 + B_2. \]
If c is a real number $\ge 0$ and $A \ge O$, then $cA \ge O$. So far, however, we
have said nothing about a product of positive hermitian operators AB,
even if AB = BA. We shall deal with this question later.

Let c be a bound for A. Then $|\langle Ax, x\rangle| \le c|x|^2$ and consequently
\[ -cI \le A \le cI. \]


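For simplicity, if $\alpha$ is real, we sometimes write $\alpha \le A$ instead of $\alpha I \le A$,
and similarly we write $A \le \beta$ instead of $A \le \beta I$. If we let
\[ \alpha = \inf_{|x|=1} \langle Ax, x\rangle \quad\text{and}\quad \beta = \sup_{|x|=1} \langle Ax, x\rangle, \]
then we have
\[ \alpha \le A \le \beta, \]
and from Theorem 2.2,
\[ |A| = \max(|\alpha|, |\beta|). \]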
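
As a numerical illustration of the formula $|A| = \max(|\alpha|, |\beta|)$, here is a minimal sketch for real symmetric 2 x 2 matrices. The helper names (`quad_form`, `form_bounds`, `norm_symmetric`) and the sample matrices are illustrative assumptions, and the sup and inf are only approximated by sampling real unit vectors. The second matrix shows that the hermitian hypothesis cannot be dropped.

```python
import math

def quad_form(a, x):
    """<Ax, x> for a real 2x2 matrix a and a real vector x."""
    ax0 = a[0][0] * x[0] + a[0][1] * x[1]
    ax1 = a[1][0] * x[0] + a[1][1] * x[1]
    return ax0 * x[0] + ax1 * x[1]

def form_bounds(a, samples=100_000):
    """Approximate (inf, sup) of <Ax, x> over real unit vectors x."""
    values = [quad_form(a, (math.cos(t), math.sin(t)))
              for t in (math.pi * k / samples for k in range(samples))]
    return min(values), max(values)

def norm_symmetric(a):
    """Operator norm of a real symmetric 2x2 matrix via its eigenvalues."""
    mean = (a[0][0] + a[1][1]) / 2
    radius = math.hypot((a[0][0] - a[1][1]) / 2, a[0][1])
    return max(abs(mean - radius), abs(mean + radius))

A = [[2.0, 1.0], [1.0, -3.0]]        # hermitian (real symmetric)
alpha, beta = form_bounds(A)
norm_A = norm_symmetric(A)           # agrees with max(|alpha|, |beta|)

N = [[0.0, 1.0], [0.0, 0.0]]         # not hermitian; its operator norm is 1
_, sup_N = form_bounds(N)            # but the form only reaches about 1/2
```

For the nilpotent matrix N the quadratic form is bounded by 1/2 on the unit sphere while |N| = 1, so the quadratic form alone does not determine the norm of a non-hermitian operator.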


The next two sections are devoted to generalizing to Hilbert space the 
spectral theorem in the finite dimensional case. These two sections are 
logically independent of each other. In the finite dimensional case, the 
spectral theorem for hermitian operators asserts that there exists a basis 
consisting of eigenvectors. We recall that an eigenvector for an operator 
A is a vector $w \ne 0$ such that there exists a number c for which Aw = cw.
We then call c an eigenvalue, and say that w, c belong to each other. In 
the next section, we describe a special type of hermitian operator for 
which the generalization to Hilbert space has the same statement as in 
the finite dimensional case. Afterwards, we give a theorem which holds 
in the general case, and can be used as a substitute for the “basis” 
statement in many applications. Some of these applications are described 
in subsequent sections.


XVIII, §3. THE SPECTRAL THEOREM FOR COMPACT
HERMITIAN OPERATORS


An operator $A\colon E \to E$ is said to be compact if given a bounded sequence
$\{x_n\}$ in E, the sequence $\{Ax_n\}$ has a convergent subsequence. It is
precisely this condition which will allow us to get an orthogonal basis for a
hermitian operator. It is clear that if E is finite dimensional, every
operator is compact.


Throughout this section, we let E be a complex Hilbert space, and
$A\colon E \to E$ a compact hermitian operator.


A subspace V of E is called A-invariant if $AV \subset V$, i.e. if $x \in V$, then
$Ax \in V$. If V is A-invariant, then its closure is also A-invariant. Furthermore
$V^\perp$ is A-invariant because if $x \in V^\perp$, then for all $y \in V$ we get
\[ \langle y, Ax\rangle = \langle Ay, x\rangle = 0. \]


We recall that the spectrum of A is the set of numbers c such that
A - cI is not invertible. We note that an eigenvalue c of a hermitian
operator is real, because if w is an eigenvector belonging to c, then
\[ c\langle w, w\rangle = \langle Aw, w\rangle = \langle w, Aw\rangle = \bar{c}\langle w, w\rangle, \]
so $c = \bar{c}$. Since the 1-dimensional space generated by an eigenvector is
A-invariant, it follows that the orthogonal complement of this space is
A-invariant.

For each eigenvalue c let $E_c$ be the space generated by all eigenvectors
having this eigenvalue, and call it the c-eigenspace. Then for every $x \in E_c$
we have Ax = cx. We note that $E_c$ is a closed subspace, and $E_0$ is the
kernel of A.

Let $w_1$ and $w_2$ be eigenvectors belonging to eigenvalues $c_1$, $c_2$ respectively,
such that $c_1 \ne c_2$. Then $w_1 \perp w_2$, because
\[ c_1\langle w_1, w_2\rangle = \langle Aw_1, w_2\rangle = \langle w_1, Aw_2\rangle = c_2\langle w_1, w_2\rangle. \]
Consequently $E_{c_1}$ is orthogonal to $E_{c_2}$.
Suppose that V is a non-zero finite dimensional A-invariant subspace of 
E. Then A restricted to V induces a self-adjoint operator on V, and 


there exists an orthogonal basis of V consisting of eigenvectors for A. 


This is a trivial fact of linear algebra, which we reprove here. Let w
be a non-zero eigenvector for A in V, and let W be the orthogonal
complement of w in V. Then W is A-invariant, and we can complete the
proof by induction.


Theorem 3.1 (Spectral Theorem). Let A be a compact hermitian operator
on the Hilbert space E. Then the family of eigenspaces $\{E_c\}$, where
c ranges over all eigenvalues (including 0), is an orthogonal decomposition
of E.


Proof. Let F be the closure of the subspace generated by all $E_c$ (as in
Corollary 1.9 of Chapter V), and let H be the orthogonal complement of
F. Then H is A-invariant, and A induces a compact hermitian operator
on H, which has no eigenvalue. We must show that H = {0}. This will
follow from the next lemma.


Lemma 3.2. Let A be a compact hermitian operator on the Hilbert
space $H \ne \{0\}$. Let c = |A|. Then c or -c is an eigenvalue for A.


Proof. There exists a sequence $\{x_n\}$ in H such that $|x_n| = 1$ and
\[ |\langle Ax_n, x_n\rangle| \to |A|. \]
Selecting a subsequence if necessary, we may assume that
\[ \langle Ax_n, x_n\rangle \to \alpha \]
for some number $\alpha$, and $\alpha = \pm|A|$. Then
\[ 0 \le |Ax_n - \alpha x_n|^2 = \langle Ax_n - \alpha x_n, Ax_n - \alpha x_n\rangle \]
\[ = |Ax_n|^2 - 2\alpha\langle Ax_n, x_n\rangle + \alpha^2|x_n|^2 \]
\[ \le \alpha^2 - 2\alpha\langle Ax_n, x_n\rangle + \alpha^2. \]
The right-hand side approaches 0 as n tends to infinity. Since A is
compact, after selecting a subsequence, we may assume that $\{Ax_n\}$ converges
to some vector y, and then $\{\alpha x_n\}$ must converge to y also. If
$\alpha = 0$, then |A| = 0 and A = O, so we are done. If $\alpha \ne 0$, then $\{x_n\}$ itself
must converge to some vector x, and then $Ax = \alpha x$ so that $\alpha$ is the
desired eigenvalue for A, thus proving our lemma, and the theorem.
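
The mechanism of this proof can be observed numerically in finite dimensions: unit vectors that nearly maximize $|\langle Ax, x\rangle|$ are nearly eigenvectors. The following is a minimal sketch under the assumption of a real symmetric 2 x 2 matrix; the grid search and the names `apply`, `dot`, `best_unit_vector` are illustrative only and are not part of the text.

```python
import math

A = [[2.0, 1.0], [1.0, -3.0]]   # real symmetric, so hermitian

def apply(a, x):
    """Matrix-vector product Ax for a 2x2 matrix."""
    return (a[0][0] * x[0] + a[0][1] * x[1],
            a[1][0] * x[0] + a[1][1] * x[1])

def dot(x, y):
    return x[0] * y[0] + x[1] * y[1]

def best_unit_vector(a, samples):
    """Unit vector on a grid of angles maximizing |<Ax, x>|."""
    grid = ((math.cos(t), math.sin(t))
            for t in (math.pi * k / samples for k in range(samples)))
    return max(grid, key=lambda x: abs(dot(apply(a, x), x)))

residuals = []
for samples in (10, 100, 1000, 10000):
    x = best_unit_vector(A, samples)
    alpha = dot(apply(A, x), x)              # approximates +|A| or -|A|
    ax = apply(A, x)
    r = (ax[0] - alpha * x[0], ax[1] - alpha * x[1])
    residuals.append(math.sqrt(dot(r, r)))   # |Ax - alpha x|
```

As the grid is refined, `alpha` approaches the eigenvalue of largest absolute value and the residual $|Ax - \alpha x|$ becomes small, in line with the estimate in the proof. In infinite dimensions it is the compactness of A that lets one pass to a convergent subsequence.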


We observe that each $E_c$ has a Hilbert basis consisting of eigenvectors,
namely any Hilbert basis of $E_c$, because all non-zero elements of $E_c$ are
eigenvectors. Hence E itself has a Hilbert basis consisting of eigenvectors.
Thus we recover precisely the analog of the theorem in the finite
dimensional case. Furthermore, we have some additional information,
which follows trivially:




If $c \ne 0$, each $E_c$ is finite dimensional; otherwise a denumerable subset
from a Hilbert basis would provide a sequence contradicting the compactness
of A. For a similar reason, given r > 0, there is only a finite
number of eigenvalues c such that $|c| \ge r$. Thus 0 is a limit of the
sequence of eigenvalues if E is infinite dimensional.


XVIII, §4. THE SPECTRAL THEOREM FOR 
HERMITIAN OPERATORS 


Let p be a polynomial with real coefficients, and let A be a hermitian
operator. Write
\[ p(t) = a_n t^n + \cdots + a_0. \]
As in Chapter IV, §5, we define
\[ p(A) = a_n A^n + \cdots + a_0 I. \]


We let R[A] be the algebra generated over R by A, that is the algebra of
all operators p(A), where $p(t) \in R[t]$. We wish to investigate the closure
of R[A] in the (real) Banach space of all operators. We shall show how
to represent this closure as a ring of continuous functions on some
compact subset of the reals. First, we observe that the hermitian operators
form a closed subspace of End(E), and that the closure $\overline{R[A]}$ is a closed
subspace of the space of hermitian operators.
As observed at the end of §2, we can find real numbers $\alpha$, $\beta$ such that
\[ \alpha I \le A \le \beta I. \]


We shall prove that if p is a real polynomial which takes on positive
values on the interval $[\alpha, \beta]$, then p(A) is a positive operator. For this
we need a purely algebraic lemma.


Lemma 4.1. Let p be a real polynomial such that $p(t) \ge 0$ for all
$t \in [\alpha, \beta]$. Then we can express p in the form
\[ p(t) = c\Bigl[\sum_i Q_i(t)^2 + \sum_j (t - \alpha)Q_j(t)^2 + \sum_k (\beta - t)Q_k(t)^2\Bigr] \]
where $Q_i$, $Q_j$, $Q_k$ are real polynomials, and $c \ge 0$.


Proof. We first factor p into linear and irreducible quadratic factors
over the real numbers. If p has a root $\gamma$ such that $\alpha < \gamma < \beta$, then the
multiplicity of $\gamma$ is even (otherwise p changes sign near $\gamma$, which is
impossible), and then $(t - \gamma)$ occurs in an even power. If a root $\gamma$ is $\le \alpha$ we
have a linear factor $t - \gamma$ which we write
\[ t - \gamma = (t - \alpha) + (\alpha - \gamma) \]
and note that $\alpha - \gamma$ is a real square. If $\gamma$ is a root $\ge \beta$, then we write
the linear factor as
\[ \gamma - t = (\gamma - \beta) + (\beta - t) \]
and note that $\gamma - \beta$ is a real square. In a factorization of p we can take
the factors to be of type $(t - \gamma)^{2m(\gamma)}$ if $\gamma$ is a root such that $\alpha < \gamma < \beta$, and
otherwise to be of type $t - \gamma$ or $\gamma - t$ according as $\gamma \le \alpha$ or $\gamma \ge \beta$. The
quadratic factors are of type $(t - a)^2 + b^2$. The constant c (which can be
taken as a constant factor) is then $\ge 0$ since p is positive on the interval.
Multiplying out all these factors, and noting that a sum of squares times
a sum of squares is a sum of squares, we conclude that p has an expression
as stated in the lemma, except that there still appear terms of type
\[ (t - \alpha)(\beta - t)Q(t)^2 \]
where Q is a real polynomial. However, such terms can be reduced to
terms of the other types by using the identity
\[ (t - \alpha)(\beta - t) = \frac{(t - \alpha)^2(\beta - t) + (t - \alpha)(\beta - t)^2}{\beta - \alpha}. \]


This proves our lemma. 
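
The identity used in the last step of the proof is verified by factoring the numerator. This short check assumes $\alpha < \beta$, so that division by $\beta - \alpha$ is legitimate.

```latex
\[
(t-\alpha)^2(\beta-t) + (t-\alpha)(\beta-t)^2
  = (t-\alpha)(\beta-t)\bigl[(t-\alpha) + (\beta-t)\bigr]
  = (t-\alpha)(\beta-t)(\beta-\alpha).
\]
\[
\text{Hence}\quad
(t-\alpha)(\beta-t)\,Q(t)^2
  = \frac{1}{\beta-\alpha}\Bigl[(\beta-t)\bigl((t-\alpha)Q(t)\bigr)^2
  + (t-\alpha)\bigl((\beta-t)Q(t)\bigr)^2\Bigr],
\]
```

which is a combination, with the positive coefficient $1/(\beta - \alpha)$, of terms of the second and third types in the lemma.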


Now to study R[A], we observe that the map
\[ p \mapsto p(A) \]
is a ring-homomorphism of R[t] onto the ring R[A]. Furthermore, if B,
C are hermitian operators such that BC = CB and $B \ge O$, then trivially,
$BC^2$ is positive because
\[ \langle BC^2x, x\rangle = \langle CBCx, x\rangle = \langle BCx, Cx\rangle \ge 0. \]
The sum of two positive hermitian operators is positive. Hence from the
expression of p in the lemma, we obtain


Lemma 4.2. If p is positive on $[\alpha, \beta]$, then p(A) is a positive operator.
If p, q are polynomials such that $p \le q$ on $[\alpha, \beta]$, then $p(A) \le q(A)$.
Finally,
\[ |p(A)| \le \|p\|, \]
the sup norm being taken on $[\alpha, \beta]$.

Proof. The first assertion comes from the remarks preceding our
lemma. The second follows at once by considering q - p. Finally, if we
let
\[ q(t) = \|p\| \pm p(t), \]
then $q \ge 0$ on $[\alpha, \beta]$ and hence $q(A) \ge O$, whence the last assertion follows
from Theorem 2.2.


We conclude that the map
\[ p \mapsto p(A) \]
is a continuous linear map from the space of polynomial functions on
$[\alpha, \beta]$ into R[A]. By the linear extension theorem, we can extend this
map to the Banach space of continuous functions on $[\alpha, \beta]$ by continuity,
and thus we can define f(A) for any continuous function f on $[\alpha, \beta]$,
by the Stone-Weierstrass theorem. If $\{p_n\}$ is a sequence of polynomials
converging uniformly to f on $[\alpha, \beta]$, then by definition,
\[ f(A) = \lim p_n(A). \]
Furthermore, again by continuity, we have
\[ |f(A)| \le \|f\|, \]
the sup norm being taken on $[\alpha, \beta]$. If $p_n \to f$ and $q_n \to g$, then $p_n q_n \to fg$.
Hence we obtain (fg)(A) = f(A)g(A) for any continuous functions f, g.
In other words, our map is also a ring-homomorphism.
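
At the polynomial level, the ring-homomorphism property $(pq)(A) = p(A)q(A)$ can be checked exactly with integer arithmetic. The sketch below is only an illustration for 2 x 2 matrices; the helper names `mat_mul`, `poly_at`, `poly_mul` and the chosen matrix and polynomials are assumptions made for the example.

```python
def mat_mul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def poly_at(coeffs, a):
    """Evaluate p(A) by Horner's rule; coeffs run from a_n down to a_0."""
    result = [[0, 0], [0, 0]]
    for c in coeffs:
        result = mat_mul(result, a)   # multiply the running value by A
        result[0][0] += c             # then add c times the identity
        result[1][1] += c
    return result

def poly_mul(p, q):
    """Coefficients of the product polynomial pq (highest degree first)."""
    out = [0] * (len(p) + len(q) - 1)
    for i, pi in enumerate(p):
        for j, qj in enumerate(q):
            out[i + j] += pi * qj
    return out

A = [[2, 1], [1, -3]]      # symmetric integer matrix
p = [1, 0, -2]             # p(t) = t^2 - 2
q = [3, 1]                 # q(t) = 3t + 1

lhs = poly_at(poly_mul(p, q), A)              # (pq)(A)
rhs = mat_mul(poly_at(p, A), poly_at(q, A))   # p(A) q(A)
```

Both sides agree entry by entry. The statement for continuous functions then follows by passing to uniform limits of polynomials, as in the text.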


Theorem 4.3. If $A \ge O$, then there exists $B \in \overline{R[A]}$ such that $B^2 = A$.
The product of two commuting positive hermitian operators is again
positive.


Proof. The continuous function $t^{1/2}$ maps on a square root of A in
$\overline{R[A]}$, and it is clear that any element of $\overline{R[A]}$ commutes with A. If A,
C commute and we write $A = B^2$ with B in $\overline{R[A]}$, then B and C also
commute because C commutes with p(A) for all real polynomials p, and
hence C commutes with all elements of $\overline{R[A]}$. But as we have seen, if
$C \ge O$, then $B^2C \ge O$. This proves our theorem.
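
In the 2 x 2 case the square root of the theorem can be written down explicitly as a polynomial of degree one in A. The following minimal sketch uses a closed form derived from the Cayley-Hamilton theorem rather than the limit of polynomials used in the text; the function name `sqrt_psd_2x2` is hypothetical, and the matrix is assumed real symmetric and positive definite.

```python
import math

def sqrt_psd_2x2(a):
    """Positive square root of a positive definite symmetric 2x2 matrix.

    Uses B = (A + sqrt(det A) I) / sqrt(tr A + 2 sqrt(det A)),
    which follows from the Cayley-Hamilton theorem applied to B.
    """
    root_det = math.sqrt(a[0][0] * a[1][1] - a[0][1] * a[1][0])
    scale = math.sqrt(a[0][0] + a[1][1] + 2 * root_det)
    return [[(a[0][0] + root_det) / scale, a[0][1] / scale],
            [a[1][0] / scale, (a[1][1] + root_det) / scale]]

def mat_mul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

A = [[2.0, 1.0], [1.0, 2.0]]     # eigenvalues 1 and 3, so A >= O
B = sqrt_psd_2x2(A)              # B = c0 I + c1 A, a polynomial in A
B_squared = mat_mul(B, B)        # should reproduce A up to rounding
```

Since B is of the form $c_0 I + c_1 A$, it visibly commutes with every operator commuting with A, which is the property used in the second half of the proof.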


The kernel of our map $f \mapsto f(A)$ is a closed ideal in the ring of
continuous functions on $[\alpha, \beta]$. We forget for a moment our definition
of the spectrum given in Chapter XVI, §1, and here define the spectrum
$\sigma(A)$ to be the closed set of zeros of this ideal. We use Theorem 2.1 of
Chapter III.

If f is any continuous function on $\sigma(A)$, we extend f to a continuous
function on $[\alpha, \beta]$ having the same sup norm, say $f_1$, and define
\[ f(A) = f_1(A). \]
If g is another extension of f to $[\alpha, \beta]$, then $g - f_1$ vanishes on $\sigma(A)$,
and hence $g(A) = f_1(A)$. Hence f(A) is well defined, independently of the
particular extension of f to $[\alpha, \beta]$. We denote by $\|\ \|_A$ the sup norm
with respect to $\sigma(A)$; thus
\[ \|f\|_A = \sup_{t \in \sigma(A)} |f(t)|. \]
We then obtain a ring-homomorphism from the ring of continuous functions
on $\sigma(A)$ into $\overline{R[A]}$, and we have
\[ |f(A)| \le \|f\|_A. \]


We now state the spectral theorem. 


Theorem 4.4. The map $f \mapsto f(A)$ is a Banach-isomorphism from the
algebra of continuous functions on $\sigma(A)$ onto the Banach algebra $\overline{R[A]}$.
A continuous function f is $\ge 0$ on $\sigma(A)$ if and only if $f(A) \ge O$.


Proof. We had derived the norm inequality previously from the positivity
statement. We do this again in the opposite direction. Thus we
assume first that $f(A) \ge O$ and prove that f is $\ge 0$ on the spectrum of A.
Assume that this is not the case. Then f is negative at some point c of
the spectrum. Let g be a continuous function whose graph is as follows:


Figure 18.2 (graph of g, with $\alpha$, c, $\beta$ marked on the horizontal axis)


Thus g is $\ge 0$, and has a positive peak at c. Then fg is $\le 0$ and fg is
negative at the point c of the spectrum. Hence $-fg \ge 0$, and hence
$-f(A)g(A) \ge O$. But $f(A) \ge O$ and $g(A) \ge O$, so that by Theorem 4.3 we
also have $f(A)g(A) \ge O$. This implies that f(A)g(A) = O, which is impossible
since fg does not vanish on the spectrum. We conclude that $f \ge 0$
on $\sigma(A)$, and in view of our previous result this proves the positivity
statement of the theorem.

Now for the norm, let b = |f(A)|. Then $bI \pm f(A) \ge O$, whence
$b \pm f(t) \ge 0$ on the spectrum. This proves that
\[ \|f\|_A \le |f(A)| \]
and hence a sequence $\{f_n(A)\}$ converges if and only if the sequence of
continuous functions $\{f_n\}$ converges uniformly on the spectrum. This
concludes the proof of the spectral theorem.


There remains to identify the spectrum as we have defined it in this 
section, and the spectrum of Chapter XVI, §1, which we shall call the 
general spectrum. 


Corollary 4.5. If A is hermitian, then the spectrum $\sigma(A)$ is equal to the
set of complex numbers z such that A - zI is not invertible.


Proof. Let z be complex and such that A - zI is not invertible. Then
z is real, for otherwise let
\[ g(t) = (t - z)(t - \bar{z}). \]
Then $g(t) \ne 0$ on $\sigma(A)$, and hence h(t) = 1/g(t) is its inverse. Then
$h(A)(A - \bar{z}I)$ would be an inverse for A - zI, a contradiction. This
proves that z is real.

Let $\xi$ be real and not in the spectrum $\sigma(A)$. Then $t - \xi$ is invertible
on $\sigma(A)$, and hence so is $A - \xi I$.

Suppose that $\xi$ is in the spectrum $\sigma(A)$. Let g be the continuous
function whose graph is as follows.


Figure 18.3 (graph of g(t))


That is,
\[ g(t) = \begin{cases} 1/|t - \xi| \quad \text{if } |t - \xi| \ge 1/N, \\ N \quad \text{if } |t - \xi| \le 1/N. \end{cases} \]
If $A - \xi I$ is invertible, let B be an inverse,
\[ B(A - \xi I) = (A - \xi I)B = I. \]
Since $|(t - \xi)g(t)| \le 1$ we get $|(A - \xi I)g(A)| \le 1$, whence
\[ |g(A)| = |B(A - \xi I)g(A)| \le |B|. \]


But g(t) has a large sup on the spectrum if we take N large, and hence
|g(A)| is equally large, a contradiction. Corollary 4.5 is proved.


The main idea to use the positivity to get the spectral theorem is due 
to F. Riesz. However, most treatments go from the positivity statement 
to an integral representation of A which we give in Chapter XX. Von 
Neumann always emphasized that it is much more efficient to prove at 
once the statement of Theorem 4.4, which suffices for many applications, 
and can be obtained quite simply from the positivity statement. In fact, 
the arguments used to derive Theorem 4.4 from the positivity statement 
are taken from a seminar of Von Neumann around 1950. 


Example. Let A be hermitian as above. Given a real number s,
consider the continuous function
\[ f_s(t) = e^{-st}. \]
Then the operator $f_s(A) = e^{-sA}$ is defined. It is an easy matter to show
that the association $s \mapsto e^{-sA}$ is a $C^\infty$ map from R into Laut(E), that it is
also a homomorphism, and satisfies the conditions:

H 1. $\dfrac{d}{ds} e^{-sA} = -Ae^{-sA}$.

H 2. For every $v \in E$, we have
\[ \lim_{s \to 0} e^{-sA}v = v. \]

One calls $e^{-sA}$ the Heat operator associated with A.
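
The homomorphism property $e^{-(s+t)A} = e^{-sA}e^{-tA}$ can be checked numerically for a small matrix. The sketch below approximates $e^{-sA}$ by partial sums of the exponential series; these are the Taylor polynomials of $e^{-st}$, which converge uniformly on any bounded interval, so this is consistent with the definition of f(A) as a limit of polynomials in A. The function name `heat`, the truncation at 40 terms, and the sample matrix are assumptions made for the illustration.

```python
def mat_mul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def heat(a, s, terms=40):
    """Approximate e^{-sA} by a partial sum of sum_k (-sA)^k / k!."""
    total = [[1.0, 0.0], [0.0, 1.0]]
    term = [[1.0, 0.0], [0.0, 1.0]]
    minus_sa = [[-s * a[i][j] for j in range(2)] for i in range(2)]
    for k in range(1, terms):
        term = mat_mul(term, minus_sa)
        term = [[term[i][j] / k for j in range(2)] for i in range(2)]
        total = [[total[i][j] + term[i][j] for j in range(2)] for i in range(2)]
    return total

A = [[2.0, 1.0], [1.0, 3.0]]
s, t = 0.3, 0.5
product = mat_mul(heat(A, s), heat(A, t))   # e^{-sA} e^{-tA}
combined = heat(A, s + t)                   # e^{-(s+t)A}
at_zero = heat(A, 0.0)                      # should be the identity (H 2 at s = 0)
```

The two matrices `product` and `combined` agree up to rounding error, and `at_zero` is the identity matrix.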


For examples of uses of Theorem 4.3, see §7. 


XVIII, §5. ORTHOGONAL PROJECTIONS


Corollary 1.8 of Chapter V shows that we have orthogonal decompositions 
in Hilbert space similar to those in euclidean spaces. A standard criterion 
for such decompositions in algebra generalizes to Hilbert spaces, namely:


Theorem 5.1. Let P, Q be hermitian operators such that
\[ P^2 = P, \qquad PQ = QP = O, \qquad P + Q = I. \]
Then $Q^2 = Q$, and we have
\[ \operatorname{Ker} P = \operatorname{Im} Q = (\operatorname{Ker} Q)^\perp. \]
In particular, we have the orthogonal decomposition
\[ E = \operatorname{Ker} P + \operatorname{Im} P. \]


Proof. This proof is independent of the spectral theorem, and uses
only basic definitions, together with Corollary 1.8 of Chapter V. Let
F = Ker P. If $x \in F$, we have
\[ x = Ix = Px + Qx = Qx \]
so that x is in the image of Q. Since PQ = QP = O, it follows that the
image of Q is in the kernel of P, whence Ker P = Im Q. We obviously
have
\[ Q^2 = (I - P)^2 = I - 2P + P^2 = I - P = Q \]
so that our relations between P and Q are symmetric. We still must
show that $F^\perp = \operatorname{Ker} Q$. Suppose that $\langle F, x\rangle = 0$. Then from
\[ \langle E, Qx\rangle = \langle QE, x\rangle = \langle F, x\rangle \]
we conclude that Qx = 0 so $F^\perp \subset \operatorname{Ker} Q$. The converse inclusion follows
from these same equalities, and our theorem is proved.
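
The relations of Theorem 5.1 can be verified exactly for the orthogonal projection onto a line in the plane. This is a minimal sketch with rational arithmetic; the vector v = (3, 4) and the variable names are chosen only for illustration.

```python
from fractions import Fraction

def mat_mul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

v = (3, 4)                                   # spans the line Im P
norm_sq = v[0] ** 2 + v[1] ** 2
# P = v v^T / |v|^2 is symmetric, hence hermitian.
P = [[Fraction(v[i] * v[j], norm_sq) for j in range(2)] for i in range(2)]
I = [[Fraction(1), Fraction(0)], [Fraction(0), Fraction(1)]]
Q = [[I[i][j] - P[i][j] for j in range(2)] for i in range(2)]   # Q = I - P

zero = [[Fraction(0)] * 2 for _ in range(2)]
checks = {
    "P^2 = P": mat_mul(P, P) == P,
    "PQ = O": mat_mul(P, Q) == zero,
    "QP = O": mat_mul(Q, P) == zero,
    "Q^2 = Q": mat_mul(Q, Q) == Q,
}

w = (-4, 3)                                  # orthogonal to v
Pw = [sum(P[i][j] * w[j] for j in range(2)) for i in range(2)]  # w lies in Ker P
Qw = [sum(Q[i][j] * w[j] for j in range(2)) for i in range(2)]  # and Qw = w
```

Here Ker P is the line spanned by w, which is also the image of Q, in agreement with Ker P = Im Q.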


Let A be an operator and let F be a subspace of E. We say that F is
invariant for A if $AF \subset F$ (that is $Ax \in F$ for all $x \in F$). If this is the case,
then it is clear that the closure $\overline{F}$ is also an invariant subspace.

Let A, B be operators such that AB = BA. Then Ker B and Im B are
invariant subspaces for A. Indeed, if Bx = 0, then BAx = ABx = 0, so
Ker B is invariant. If y = Bx, then Ay = ABx = BAx, so Im B is invariant.

An operator A which is hermitian is said to be positive definite if
$A \ge cI > O$ for some c > 0. If F is a closed invariant subspace for A, we
say that A is positive definite on F if the restriction of A to F is positive
definite. (This restriction is clearly hermitian.) We say that A is negative
definite if -A is positive definite.


Corollary 5.2. Let A be an invertible hermitian operator. Then there
exists an orthogonal decomposition $E = F + F^\perp$ such that F, $F^\perp$ are
A-invariant closed subspaces, and such that A is positive definite on F
and negative definite on $F^\perp$.


Proof. We use the spectral theorem. Let g be the function such that
g(t) = 1 if $t \ge 0$ and g(t) = 0 if t < 0. Since A is invertible, it follows that
0 is not in the spectrum of A. Hence g is continuous on the spectrum,
and $g^2 = g$ on the spectrum. Hence g(A) = P satisfies $P^2 = P$. Let F =
Im P. Then P is an orthogonal projection on F by Theorem 5.1. Since
A commutes with g(A), and since $tg(t) \ge 0$ on the spectrum of A, it
follows that AP = PA is a positive operator. Furthermore, A maps F
into itself, and since $A^{-1}$ exists on E, and also maps F into itself, let $A^+$
be the restriction of A to F. Then $A^+$ is positive, invertible on F, whence
positive definite (because the spectrum is closed, and 0 is not in the
spectrum). Similarly, let h(t) = 1 - g(t) and Q = h(A). Then $th(t) \le 0$ on
the spectrum of A, and by similar arguments, letting $A^-$ be the restriction
of -A to $F^\perp$, we conclude that $A^-$ is positive definite on $F^\perp$. This
proves what we wanted.


Corollary 5.3. Let A be an invertible hermitian operator. Then there
exist an orthogonal decomposition $E = F + F^\perp$ and positive definite operators
$A^+$ on F, $A^-$ on $F^\perp$ such that if we write x = y + z with $y \in F$
and $z \in F^\perp$, then
\[ \langle Ax, x\rangle = \langle A^+y, y\rangle - \langle A^-z, z\rangle. \]
Proof. This is a rephrasing of the preceding result.
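
A simple finite dimensional instance may help fix the notation. The diagonal operator below is chosen only for illustration; g is the function from the proof of Corollary 5.2, and $e_1, e_2$ denote the standard unit vectors of $\mathbf{C}^2$.

```latex
\[
E = \mathbf{C}^2, \qquad
A = \begin{pmatrix} 2 & 0 \\ 0 & -3 \end{pmatrix}, \qquad
\sigma(A) = \{2, -3\}, \qquad
P = g(A) = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}.
\]
\[
F = \operatorname{Im} P = \mathbf{C}e_1, \qquad
F^\perp = \mathbf{C}e_2, \qquad
A^+ = 2 \ \text{on } F, \qquad
A^- = 3 \ \text{on } F^\perp .
\]
\[
x = y e_1 + z e_2 \;\Longrightarrow\;
\langle Ax, x\rangle = 2|y|^2 - 3|z|^2
 = \langle A^+ y e_1, y e_1\rangle - \langle A^- z e_2, z e_2\rangle .
\]
```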


Finally, for a positive operator, we can go one step further in our
normalization. Namely, if $A \ge O$, then we can write $A = B^2$ for some B
in $\overline{R[A]}$, and hence if $A \ge O$, then the quadratic form $x \mapsto \langle Ax, x\rangle$ can
be written
\[ \langle Ax, x\rangle = \langle A^{1/2}x, A^{1/2}x\rangle. \]
If A is invertible, so is $A^{1/2}$. This corresponds to the diagonalization of
quadratic (or symmetric bilinear) forms in the finite dimensional case.
Indeed, in that case, a positive form can be written as
\[ a_1y_1^2 + \cdots + a_ny_n^2 \qquad (a_i \ge 0) \]
and a negative form can be written as
\[ -(b_1z_1^2 + \cdots + b_mz_m^2) \qquad (b_j \ge 0) \]
with respect to a suitable orthonormal basis of the given positive definite
hermitian product $\langle\ ,\ \rangle$ on euclidean space.


XVIII, §6. SCHUR’S LEMMA


Let S be a set of operators on the Hilbert space E, and let F be a
subspace of E. We say that F is invariant under S if for every $A \in S$ and
$x \in F$ we have $Ax \in F$. In other words, $AF \subset F$ for every $A \in S$. If F is
invariant under S, we observe that its closure $\overline{F}$ is also invariant under S.


Theorem 6.1. Let S be a set of operators on the Hilbert space E,
leaving no closed subspace invariant except {0} and E itself. Let A be a
hermitian operator such that AB = BA for all $B \in S$. Then A = cI for
some real number c.


Proof. It will suffice to prove that there is only one element in the
spectrum of A. Suppose that there are two, $c_1 \ne c_2$. There exist continuous
functions f, g on the spectrum such that neither is 0 on the spectrum,
but fg is 0 on the spectrum. For instance, we can take for f, g the
functions whose graphs are indicated on the next figure.


Figure 18.4 (graphs of f and g, with $c_1$, $c_2$ marked on the horizontal axis)


We have f(A)B = Bf(A) for all $B \in S$ (because B commutes with real
polynomials in A, hence with their limits). Hence f(A)E is invariant
under S because
\[ Bf(A)E = f(A)BE \subset f(A)E. \]
Let F be the closure of f(A)E. Then $F \ne \{0\}$ because $f(A) \ne O$. Furthermore,
$F \ne E$ because g(A)f(A)E = {0} and hence g(A)F = {0}. Since F is
invariant under S, we have a contradiction, thus proving our theorem.


Corollary 6.2. Let S be a set of operators of the Hilbert space E,
leaving no closed subspace invariant except {0} and E itself. Let A be
an operator such that AA* = A*A, AT = TA, and A*T = TA* for all
$T \in S$. Then A = cI for some complex number c.


Proof. Write A = B + iC where B, C are hermitian and commute (e.g.
B = (A + A*)/2 and C = (A - A*)/2i). Apply the theorem to each one of
B and C to prove the corollary.

Remark. Schur’s lemma is used, among other places, in the representation
theory of groups. Let G be a group, and suppose that we have a
homomorphism (called a representation)
\[ \rho\colon G \to \operatorname{Laut}(E) \]
of G into the toplinear automorphisms of a Hilbert space E. Assume
that G is commutative, and that the image $\rho(G)$ satisfies the hypotheses
of the set S in the corollary, and also is such that if $A \in \rho(G)$, then
$A^* \in \rho(G)$. Then we conclude that for each $\sigma \in G$, the image $\rho(\sigma)$ is equal
to $c_\sigma I$. Thus $\sigma \mapsto c_\sigma$ is a homomorphism of G into the multiplicative
group of complex numbers. In the terminology of representations, one
says that an irreducible unitary representation of G is one dimensional,
because E must then be of dimension 1.


XVIII, §7. POLAR DECOMPOSITION OF ENDOMORPHISMS 


Among other things, this section shows some ways how Theorem 4.3 is 
used. Namely, for any operator T on a Hilbert space E, the operator 
T*T is hermitian positive, and so has a square root by Theorem 4.3. We 
start with a special case of the polar decomposition which is of interest 
for its own sake. 


Theorem 7.1. Let $T\colon E \to E$ be an operator on the Hilbert space E.
Assume that Ker T = {0}, and that TE is dense in E. Then $\operatorname{Im}(T^*T)^{1/2}$ is
dense, and there exists a continuous linear map U defined on this image
such that
\[ U(T^*T)^{1/2}x = Tx. \]
This operator U is norm preserving, and its kernel is {0}.

Proof. To show that U is well defined, it suffices to prove that if
\[ (T^*T)^{1/2}x = (T^*T)^{1/2}y, \]
then Tx = Ty. But applying $(T^*T)^{1/2}$ yields T*Tx = T*Ty. On the
other hand, the kernel of T* is {0}, because if T*u = 0 then for any $v \in E$
we have
\[ 0 = \langle T^*u, v\rangle = \langle u, Tv\rangle, \]
and since the image of T is dense, this implies that u is orthogonal to all
of E, whence u = 0. Hence we get Tx = Ty, thus defining U uniquely by
our given formula. Then we find
\[ \langle (T^*T)^{1/2}x, (T^*T)^{1/2}y\rangle = \langle x, T^*Ty\rangle = \langle Tx, Ty\rangle, \]
whence it follows that U preserves lengths on the domain of definition.
The fact that Ker U = {0} is a consequence of the norm-preserving
property.

The image of $(T^*T)^{1/2}$ is dense, for suppose that y is orthogonal to
this image. Let $x = (T^*T)^{1/2}y$. Then $(T^*T)^{1/2}x = T^*Ty$, and we get
\[ 0 = \langle T^*Ty, y\rangle = \langle Ty, Ty\rangle = |Ty|^2. \]
Since Ker T = {0}, we get y = 0 whence $\operatorname{Im}(T^*T)^{1/2}$ is dense. This concludes
the proof of the theorem.


The norm-preserving linear map U in the theorem can then be extended
by continuity to all of E (which is the closure of its domain of
definition), and this extension is a unitary automorphism of E.

This result is useful in the theory of group representations. Indeed, 
suppose that we are given a group G and two homomorphisms
\[ R\colon G \to \operatorname{Laut}(E) \quad\text{and}\quad S\colon G \to \operatorname{Laut}(E) \]
of G into the group of invertible operators on E. Assume that T and T*
commute with (R, S) in the sense that for all $\sigma \in G$ we have
\[ TR(\sigma) = S(\sigma)T \quad\text{and}\quad T^*S(\sigma) = R(\sigma)T^*. \]
Verify that U satisfies similar relationships. This shows that our two
representations are “isomorphic”.

Next we deal with the general polar decomposition. Let H be a
Hilbert space. Let $U\colon H \to E$ be a bounded linear map into some other
Hilbert space E. We say that U is a partial isometry if there exists an
orthogonal decomposition
\[ H = H_1 \perp H_2 \]
such that the restriction of U to $H_1$ is an isometry (norm-preserving
linear map) onto the image of $H_1$, and the restriction of U to $H_2$ is equal
to 0.


The purpose of this section is to prove: 


Theorem 7.2. Let $A\colon H \to H$ be an arbitrary operator. Then there
exists a unique decomposition
\[ A = UP, \]
where U is a partial isometry and P is hermitian positive.

Proof. We shall give the essential steps of the proof, and leave certain
routine details as Exercise 15.

First note that A*A is symmetric positive and has a unique symmetric
positive square root, denoted by
\[ P_A = (A^*A)^{1/2}. \]
(a) We can define a linear map $U = U_A$ on $\operatorname{Im} P_A$ by the formula
\[ U(A^*A)^{1/2}v = Av, \]
namely this is well defined. (If $(A^*A)^{1/2}v = 0$ then Av = 0.)

(b) The map $U\colon \operatorname{Im} P_A \to \operatorname{Im} A$ is a unitary map, which can therefore
be extended by continuity to the closure of $\operatorname{Im} P_A$. Define U to be
0 on the orthogonal complement of $\operatorname{Im} P_A$. Then U is a partial
isometry.

(c) The decomposition A = UP into a partial isometry U (relative
to Im P) and a positive operator P is unique, i.e. if A = WQ, then
P = Q, U = W.

(d) We have $A^* = U_{A^*}P_{A^*}$, where $U_{A^*} = U^*$ and $P_{A^*} = UP_AU^*$.


The decomposition A = UP is called the polar decomposition of A, and
(d) gives the polar decomposition of A* in terms of the polar decomposition
of A.
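
For an invertible real 2 x 2 matrix the steps (a) and (b) can be carried out explicitly; in that case Im P is the whole space and U is orthogonal. The sketch below is an illustration under these assumptions. The helper `sqrt_psd_2x2` uses a closed form for the square root of a positive definite symmetric 2 x 2 matrix and is not the construction of the text; all names are hypothetical.

```python
import math

def mat_mul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def transpose(a):
    return [[a[j][i] for j in range(2)] for i in range(2)]

def sqrt_psd_2x2(a):
    """Positive square root of a positive definite symmetric 2x2 matrix."""
    root_det = math.sqrt(a[0][0] * a[1][1] - a[0][1] * a[1][0])
    scale = math.sqrt(a[0][0] + a[1][1] + 2 * root_det)
    return [[(a[0][0] + root_det) / scale, a[0][1] / scale],
            [a[1][0] / scale, (a[1][1] + root_det) / scale]]

def inverse(a):
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    return [[a[1][1] / det, -a[0][1] / det],
            [-a[1][0] / det, a[0][0] / det]]

A = [[1.0, 2.0], [0.0, 1.0]]                 # invertible, not symmetric
P = sqrt_psd_2x2(mat_mul(transpose(A), A))   # P = (A*A)^{1/2}
U = mat_mul(A, inverse(P))                   # defined by U P v = A v
UtU = mat_mul(transpose(U), U)               # should be the identity
UP = mat_mul(U, P)                           # should reproduce A
```

For this particular A the factor U turns out to be a rotation of the plane, and P is symmetric with positive eigenvalues.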


XVIII, §8. THE MORSE-PALAIS LEMMA


Let U be an open set in some (real) Hilbert space E, and let f be a $C^{p+2}$
function on U, with $p \ge 1$. We say that $x_0$ is a critical point for f if
$Df(x_0) = 0$. We wish to investigate the behavior of f at a critical point.
After translations, we can assume that $x_0 = 0$ and that $f(x_0) = 0$. We
observe that the second derivative $D^2f(0)$ is a continuous bilinear form
on E. Let $\lambda = D^2f(0)$, and for each $x \in E$ let $\lambda_x$ be the functional
$y \mapsto \lambda(x, y)$. If the map $x \mapsto \lambda_x$ is a toplinear isomorphism of E with its
dual space E', then we say that $\lambda$ is non-singular, and we say that the
critical point is non-degenerate.

We recall that a local $C^p$-isomorphism $\varphi$ at 0 is a $C^p$-invertible map
defined on an open set containing 0.


Theorem 8.1. Let f be a $C^{p+2}$ function defined on an open neighborhood
of 0 in the Hilbert space E, with $p \ge 1$. Assume that f(0) = 0, and
that 0 is a non-degenerate critical point of f. Then there exists a local
$C^p$-isomorphism at 0, say $\varphi$, and an invertible symmetric operator A such
that
\[ f(x) = \langle A\varphi(x), \varphi(x)\rangle. \]
Proof. We may assume that U is a ball around 0. We have
\[ f(x) = f(x) - f(0) = \int_0^1 Df(tx)x \, dt, \]
and applying the same formula to Df instead of f, we get
\[ f(x) = \int_0^1 \int_0^1 D^2f(stx)tx \cdot x \, ds \, dt = g(x)(x, x) \]
where
\[ g(x) = \int_0^1 \int_0^1 D^2f(stx)t \, ds \, dt. \]


Then g is a $C^p$ map into the Banach space of continuous bilinear maps
on E, and even the space of symmetric such maps by Theorem 5.3 of
Chapter XIII. We know that this Banach space is toplinearly isomorphic
to the space of symmetric operators on E, and thus we can write
\[ f(x) = \langle A(x)x, x\rangle \]
where $A\colon U \to \operatorname{Sym}(E)$ is a $C^p$ map of U into the space of symmetric
operators on E. A straightforward computation shows that
\[ \tfrac{1}{2}D^2f(0)(v, w) = \langle A(0)v, w\rangle. \]
Since we assumed that $D^2f(0)$ is non-singular, this means that A(0) is
invertible, and hence A(x) is invertible for all x sufficiently near 0.

We want to define $\varphi(x)$ to be C(x)x where C is a suitable $C^p$ map
from a neighborhood of 0 into the open set of invertible operators, and
in such a way that we have
\[ \langle A(x)x, x\rangle = \langle A(0)\varphi(x), \varphi(x)\rangle = \langle A(0)C(x)x, C(x)x\rangle. \]
This means that we must seek a map C such that
\[ C(x)^*A(0)C(x) = A(x). \]
If we let $B(x) = A(0)^{-1}A(x)$, then B(x) is close to the identity I for small
x. The square root function has a power series expansion near 1, which
is a uniform limit of polynomials, and is $C^\infty$ on a neighborhood of I (cf.
Exercise 2 of Chapter XIII), and we can therefore take the square root of
B(x), so that we let
\[ C(x) = B(x)^{1/2}. \]


We contend that this C(x) does what we want. Indeed, since both A(0)
(hence $A(0)^{-1}$) and A(x) are self adjoint, we find that
\[ B(x)^* = A(x)A(0)^{-1}, \]
whence
\[ B(x)^*A(0) = A(0)B(x). \]
But C(x) is a power series in I - B(x), and C(x)* is the same power
series in I - B(x)*. The preceding relation holds if we replace B(x) by
any power of B(x) (by induction), hence it holds if we replace B(x) by
any polynomial in I - B(x), and hence finally, it holds if we replace B(x)
by C(x), and thus
\[ C(x)^*A(0)C(x) = A(0)C(x)C(x) = A(0)B(x) = A(x), \]
which is the desired relation.

All that remains to be shown is that $\varphi$ is a local $C^p$-isomorphism at 0.
But one verifies that in fact, $D\varphi(0) = C(0)$, so that what we need follows
from the inverse mapping theorem. This concludes the proof of Theorem
8.1.
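
To see the construction at work, one can run through the proof in the simplest case $E = \mathbf{R}$. The function $f(x) = x^2 + x^3$ below is an example chosen for illustration; the symbols g, A(x), B(x), C(x) and $\varphi$ are those of the proof, with bilinear maps on $\mathbf{R}$ identified with real numbers.

```latex
\[
f(x) = x^2 + x^3, \qquad f(0) = 0, \quad f'(0) = 0, \quad f''(u) = 2 + 6u .
\]
\[
g(x) = \int_0^1\!\!\int_0^1 (2 + 6stx)\,t \, ds\, dt
     = \int_0^1 (2t + 3t^2x)\, dt = 1 + x ,
\qquad
A(x) = 1 + x, \quad A(0) = 1 = \tfrac12 f''(0).
\]
\[
B(x) = A(0)^{-1}A(x) = 1 + x, \qquad
C(x) = (1 + x)^{1/2}, \qquad
\varphi(x) = x\,(1 + x)^{1/2} .
\]
\[
\langle A(0)\varphi(x), \varphi(x)\rangle = x^2(1 + x) = f(x),
\qquad
\varphi'(0) = C(0) = 1 \neq 0 .
\]
```

Thus near 0 the function f is the square of the new coordinate $\varphi(x)$, which is a local $C^\infty$-isomorphism for x > -1 sufficiently close to 0.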


Corollary 8.2. Let f be a $C^{p+2}$ function near 0 on the Hilbert space E,
such that 0 is a non-degenerate critical point. Then there exists a local
$C^p$-isomorphism $\psi$ at 0, and an orthogonal decomposition $E = F + F^\perp$,
such that if we write x = y + z with $y \in F$ and $z \in F^\perp$, then
\[ f(\psi(x)) = \langle y, y\rangle - \langle z, z\rangle. \]
Proof. The theorem reduces the problem to the case discussed in
Corollaries 5.2 and 5.3. In that case, on a space where A is positive
definite, we can always make the toplinear isomorphism $x \mapsto A^{1/2}x$ to get
the quadratic form to become the given hermitian product $\langle\ ,\ \rangle$, and
similarly on the space where A is negative definite.


Note. The Morse—Palais lemma was proved originally by Morse in 
the finite dimensional case, using the Gram-—Schmidt orthogonalization 
process. The elegant generalization and its proof in the Hilbert space 
case is due to Palais. Cf. [Pa 2]. It shows (in the language of coordinate 
systems) that a function near a critical point can be expressed as a 
quadratic form after a suitable change of coordinate system (satisfying 
requirements of differentiability). It comes up naturally in the calculus of
variations, [Pa 1] and [Sm 1]. For instance, one considers a space of
paths (of various smoothness) $\sigma\colon [a, b] \to E$ where E is a Hilbert space.
One then defines a function on these paths, essentially related to the
length
\[ f(\sigma) = \int_a^b \langle \sigma'(t), \sigma'(t)\rangle \, dt \]


and one investigates the critical points of this function, especially its 
minimum values. These turn out to be the solutions of the variational 
problem, by definition of what one means by a variational problem. 
Even if E is finite dimensional, so a euclidean space, the space of paths 
is infinite dimensional, so that we need an infinite dimensional theory to 
deal with this question. 


XVIII, §9. EXERCISES 


1. Let E be a Hilbert space. 

(a) Let P be a hermitian operator such that P? = P. Show that P is an 
orthogonal projection on a closed subspace. 

(b) Conversely, let $E = F + F^\perp$ be an orthogonal decomposition, where F is
a closed subspace. Let P be the orthogonal projection on F, and assume
that $F \ne \{0\}$. (i) Show that |P| = 1, and that $P^2 = P$. (ii) Show that P is
hermitian.


2. Let A be hermitian and positive. Show that for all x, y we have
\[ |\langle Ax, y\rangle|^2 \le |\langle Ax, x\rangle||\langle Ay, y\rangle|. \]


3. Prove that the four conditions UN 1 through UN 4 defining a unitary opera- 
tor are equivalent. 


4. If A is hermitian, show that I + iA is invertible. 


5. Let A be an operator and let $\sigma(A)$ be its spectrum (we assume that our
Hilbert space is complex). Show that
\[ \sigma(A^*) = \overline{\sigma(A)}, \]
the bar denoting complex conjugation.


6. Prove the statement made in the text that if A is compact, hermitian, then 
<Ax, x> takes on a maximum or minimum on the unit sphere. 


7. Let E be the space of real valued continuous functions on [0,1] and let 
$M\colon E \to E$ be the linear map given by 


(Mf)(x) = xf(x). 


We take the $L^2$-norm on E, arising from the scalar product 

$$\langle f, g \rangle = \int_0^1 f g,$$

and we let $\bar{E}$ denote the completion of E with respect to this norm. Show 
that M is self-adjoint, and that for any real $\alpha$, the operator $M - \alpha I$ is not 
invertible on $\bar{E}$, for otherwise, it would be invertible on E. Show that $\alpha$ is 
not an eigenvalue of M. Note: M is obviously injective on E, but you will 
have to prove that, for instance, it is injective on $\bar{E}$, so deal with $L^2$-Cauchy 
sequences in E. 


8. Let E be a complex Hilbert space with a denumerable orthonormal basis 
$\{x_n\}$ (n = 1, 2, ...). Let S be a compact infinite subset of the complex 
numbers. Show that there exists a denumerable dense subset $\{\alpha_n\}$ of S. Show 
that there exists a unique operator A on E such that $A x_n = \alpha_n x_n$ for all 
n, and that the spectrum of A is equal to S. Show that the eigenvalues of 
A are precisely equal to the numbers $\alpha_n$. Show that if $\alpha$ is in S and not 
equal to any $\alpha_n$, then the image of $A - \alpha I$ is dense in E but not equal 
to E. 


9. Let $l^2$ be the Hilbert space of sequences $\alpha = \{a_n\}$, $n \geq 1$, such that $\sum |a_n|^2$ 
converges, with the hermitian product 

$$\langle \alpha, \beta \rangle = \sum a_n \bar{b}_n$$

if $\beta = \{b_n\}$. Let T be the shift operator, that is 

$$T\alpha = (0, a_1, a_2, a_3, \ldots).$$

Show that the spectrum of T is the unit disc and that T has no eigenvalue. 


10. Irreducible representations of compact groups. Let G be a compact group, and 
E a complex Hilbert space. A (unitary) representation R of G in E is a 
continuous homomorphism, 

$$R\colon G \to \operatorname{Aut}(E)$$

of G into the group of (unitary) automorphisms of E. We let dx be a Haar 
measure on G such that G has measure 1. We say that a representation R is 
irreducible if there is no closed subspace of E invariant under R(G) other 
than 0 and E itself. The basic result is: 


Theorem. If $R\colon G \to \operatorname{Aut}(E)$ is a unitary irreducible representation of a 
compact group, then E is finite dimensional. 


Prove this by the following steps. Let $\{v_i\}$ be an orthonormal basis of E. 
Let P be the projection on the one-dimensional space generated by $v_1$. 
(a) Using Schur's lemma, prove that there exists a number c such that 

$$\int_G R(x) P R(x)^{-1} \, dx = cI.$$

In fact, show that the operator on the left is a positive operator, commuting 
with all R(a), $a \in G$. 
(b) Considering $\langle c v_1, v_1 \rangle$, show that c > 0. 


(c) For any $x \in G$, $\{R(x)^{-1} v_i\}$ is an orthonormal basis $\{w_i\}$. Prove that for 
any n, 

$$\sum_{i=1}^{n} \langle P w_i, w_i \rangle \leq 1.$$

(d) Conclude that $nc \leq 1$, whence n is bounded. 


11. Let A be a hermitian operator on the Hilbert space E, and assume that the 
spectrum of A is the union of two disjoint closed sets S, T. Show that E 
admits a direct sum decomposition into two closed subspaces $E_S$ and $E_T$ 
which are A-invariant, and such that, if we let $A_S$ and $A_T$ be the restriction of 
A to $E_S$ and $E_T$ respectively, then the spectrum of $A_S$ is S and the spectrum 
of $A_T$ is T. (Cf. Exercise 2 of Chapter XVI.) 


12. Let A be a hermitian operator on a Hilbert space. If c is an isolated point of 
the spectrum, show that c is an eigenvalue. 


13. Show that an operator A on a Hilbert space is hermitian positive if and only 
if there exists an operator B such that A = B*B. 


14. Let S be a non-zero Banach subalgebra of operators on a Hilbert space E. 
Assume that S is *-closed (i.e. if $A \in S$ then $A^* \in S$), and that all elements of S 
consist of compact operators. Prove that there exists an S-irreducible 
subspace (i.e. a subspace $\neq 0$ which has no S-invariant subspace other than 0 
and itself), and that E is the orthogonal sum of S-irreducible subspaces. 
[Hint: Writing A = B + iC, where B, C are hermitian, you can find a hermitian 
element A in S such that $A \neq 0$. Let $\lambda$ be an eigenvalue for A, and 
among all S-invariant subspaces, let $M \neq 0$ be such that the eigenspace $M_\lambda$ 
for $\lambda$ has minimal dimension. Let $v \in M_\lambda$, $v \neq 0$. Prove that the closure of Sv 
is irreducible.] 


15. Give the details of the proofs in statements (a), (b), (c), (d) for Theorem 8.1. 


The next exercise gives a complement and refinement of Theorem 8.1 when we 
deal with an automorphism of the Hilbert space. 


Polar Decomposition of an Automorphism 


16. Let $A\colon E \to E$ be an operator on a Hilbert space. 

(a) If A is unitary, then show that $A^{-1}$ is unitary. Also, A is hilbertian if and 
only if $A^*A = I$. If A, B are hilbertian, so is AB. In the language of 
algebra, hilbertian operators form a group, denoted by Hilb(E). 

(b) An operator A is said to be skew-symmetric if A* = —A. Since we work 
in the real case, we shall say that an operator is symmetric instead of 
hermitian, and let Sym(E) denote the space of symmetric operators on E. 
We let Sk(E) denote the space of skew-symmetric operators. Show that 
End(E) is a direct sum 


$$L(E, E) = \operatorname{Sym}(E) \oplus \operatorname{Sk}(E).$$
(c) For all operators A, show that the series 

$$\exp(A) = I + A + \frac{A^2}{2!} + \cdots$$

converges, and if AB = BA, then 

$$\exp(A + B) = \exp(A) \exp(B).$$

For all operators A sufficiently close to the identity I, the series 

$$\log A = (A - I) - \frac{(A - I)^2}{2} + \cdots$$

converges, and if AB = BA, then 

$$\log AB = \log A + \log B.$$

(d) If A is symmetric (resp. skew-symmetric), then exp A is symmetric positive 
definite (resp. hilbertian). If A is a toplinear automorphism sufficiently 
close to I and is positive definite symmetric (resp. hilbertian), then log A is 
symmetric (resp. skew-symmetric). 

(e) Show that the exponential map gives a homeomorphism from the space 
Sym(E) of symmetric operators of E to the space Pos(E) of symmetric 
positive definite automorphisms of E. Define its inverse. 

(f) Show that the space of toplinear automorphisms of the Hilbert space E is 
homeomorphic to the product 

$$\operatorname{Hilb}(E) \times \operatorname{Pos}(E)$$

under the map given by 

$$(H, P) \mapsto HP.$$

[Hint: To construct the inverse, given an invertible operator A, we must 
express it in a unique way as a product A = HP where H is hilbertian, P 
is symmetric positive definite, and both H, P depend continuously on A. 
Show that 

$$P = (A^*A)^{1/2} \quad \text{and} \quad H = AP^{-1}$$

exist and satisfy our requirements.] 
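
The hint in part (f) can be tried out numerically in finite dimensions. The following is only a sketch under stated assumptions: the matrix `A` is a hypothetical invertible real 3 x 3 example, the helper name `polar_decomposition` is ours, and the square root of $A^*A$ is computed from an eigendecomposition rather than by any method the text prescribes.

```python
import numpy as np

def polar_decomposition(A):
    """Return (H, P) with A = H @ P, H orthogonal, P symmetric positive definite.

    Assumes A is a real invertible square matrix.
    """
    # P = (A^T A)^{1/2}, built from the eigendecomposition of the symmetric matrix A^T A.
    eigvals, eigvecs = np.linalg.eigh(A.T @ A)
    P = eigvecs @ np.diag(np.sqrt(eigvals)) @ eigvecs.T
    # H = A P^{-1}
    H = A @ np.linalg.inv(P)
    return H, P

# Hypothetical example matrix with determinant 13, hence invertible.
A = np.array([[2.0, 1.0, 0.0],
              [0.0, 3.0, 1.0],
              [1.0, 0.0, 2.0]])
H, P = polar_decomposition(A)
```

One can then check that `H @ P` reproduces `A`, that `H.T @ H` is the identity, and that `P` is symmetric with positive eigenvalues.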


Hilbert-Schmidt Operators 


17. Assume that H has a countable Hilbert basis. An operator A is called 
Hilbert-Schmidt if there exists some Hilbert basis $\{u_i\}$ such that 

$$\sum_i |Au_i|^2 < \infty.$$

(a) Prove that the same convergence holds for any other Hilbert basis $\{v_i\}$. 
For Hilbert-Schmidt operators A and B, define their scalar product 

$$\langle A, B \rangle = \sum_i \langle Au_i, Bu_i \rangle$$

with some Hilbert basis $\{u_i\}$. 

(b) Show that the sum is absolutely convergent, and independent of the 
choice of Hilbert basis. Show that B*A is Hilbert-Schmidt. 

(c) Show that the Hilbert-Schmidt operators form a vector space, which 
therefore has the scalar product defined above. Denote the corresponding 
norm by 

$$N_2(A) \quad \text{or} \quad \|A\|_2,$$

so that 

$$\|A\|_2^2 = \sum_i |Au_i|^2.$$

Prove the additional properties, where A, B denote Hilbert-Schmidt operators, 
and X denotes an arbitrary operator. 


HS 1. $\|A^*\|_2 = \|A\|_2$. 

HS 2. XA and AX are Hilbert-Schmidt, and 

$$\|XA\|_2 \leq |X| \, \|A\|_2, \qquad \|AX\|_2 \leq |X| \, \|A\|_2.$$

HS 3. A Hilbert-Schmidt operator is compact. 
[For HS 3, use the projection on the finite dimensional spaces generated by 
$u_1, \ldots, u_n$ for a finite number of $u_i$.] 

HS 4. $\|A + B\|_2^2 - \|A\|_2^2 - \|B\|_2^2 = 2 \operatorname{Re} \langle A, B \rangle$. 

HS 5. $\operatorname{Re} \sum \langle Au_i, Bu_i \rangle = \operatorname{Re} \sum \langle A^*u_i, B^*u_i \rangle$. 

HS 6. $\langle A^*, B^* \rangle = \overline{\langle A, B \rangle}$. 

HS 7. $\langle XA, B \rangle = \langle A, X^*B \rangle$ and $\langle AX, B \rangle = \langle A, BX^* \rangle$. 
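
Part (a) and HS 2 can be illustrated in finite dimensions, where every operator is Hilbert-Schmidt. The sketch below is an illustration and not a proof: the matrices are random, the second orthonormal basis comes from a QR factorization, and the helper name `hs_norm_sq` is our own.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 5
A = rng.standard_normal((n, n))
X = rng.standard_normal((n, n))

def hs_norm_sq(A, basis):
    """Sum of |A u_i|^2 over the columns u_i of `basis` (assumed orthonormal)."""
    return float(np.sum(np.abs(A @ basis) ** 2))

standard = np.eye(n)                                # the standard basis
Q, _ = np.linalg.qr(rng.standard_normal((n, n)))    # another orthonormal basis

norm_std = hs_norm_sq(A, standard)
norm_rot = hs_norm_sq(A, Q)      # should agree with norm_std, as in part (a)

# HS 2: the Hilbert-Schmidt norm of XA is at most |X| times that of A,
# where |X| is the operator norm (largest singular value).
op_norm_X = np.linalg.norm(X, 2)
lhs = np.sqrt(hs_norm_sq(X @ A, standard))
rhs = op_norm_X * np.sqrt(norm_std)
```

Both bases give the same value up to rounding, and `lhs` does not exceed `rhs`.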


Trace Class Operators 


18. An operator is said to be of trace class if it is the product of two 
Hilbert-Schmidt operators, say A = B*C where B, C are Hilbert-Schmidt. For such 
operators A, define the trace of A: 

$$\operatorname{tr}(A) = \sum_i \langle Au_i, u_i \rangle = \sum_i \langle Cu_i, Bu_i \rangle = \langle C, B \rangle.$$

The first sum shows that the trace is independent of the choice of B, C. 
Show: 


TR 1. $|\operatorname{tr}(A)| \leq \|B\|_2 \|C\|_2$. 

TR 2. If A is of trace class, so are AX and XA, and we have 
$\operatorname{tr}(AX) = \operatorname{tr}(XA)$. 

TR 3. A is of trace class if and only if $P_A$ is of trace class, and 
$\operatorname{tr}(P_A) = \operatorname{tr}(P_{A^*})$. 

TR 4. Let P be a symmetric positive operator. Then P is of trace class if and 
only if $\sum \langle Pu_i, u_i \rangle$ converges. 
[Hint: Use $P^{1/2}$.] If A is an operator of trace class, define 

$$N_1(A) = \|A\|_1 = \operatorname{tr} P_A = \|P_A^{1/2}\|_2^2.$$

TR 5. The operators of trace class form a vector space; the function 

$$A \mapsto \|A\|_1$$

is a norm, satisfying $\|A\|_1 = \|A^*\|_1$. 
[Hint: Write $P_{A+B} = U^*(A + B)$ where U is a partial isometry.] 

TR 6. If A is of trace class, so are XA, AX, and we have 

$$\|XA\|_1 \leq |X| \, \|A\|_1 \quad \text{and} \quad \|AX\|_1 \leq |X| \, \|A\|_1.$$

TR 7. If A is of trace class, then $|\operatorname{tr} A| \leq \|A\|_1$. 

TR 8. Let $T_n$ be a sequence of operators on H converging weakly to an operator 
T. In other words, for each v, $w \in H$, suppose that $\langle T_n v, w \rangle \to \langle Tv, w \rangle$. 
Let A be of trace class. Then 

$$\operatorname{tr}(TA) = \lim \operatorname{tr}(T_n A).$$

[Hint: Assume first that A = P is positive symmetric. Since A is compact 
(because A is Hilbert-Schmidt and HS 3), there is a Hilbert basis of H 
consisting of eigenvectors $\{u_i\}$, with $Au_i = c_i u_i$. You will now need the 
uniform boundedness theorem to conclude that the norms $|T_n|$ are bounded. Then 
use the absolute convergence 

$$\sum |c_i| < \infty$$

to prove the assertion in this case. In general write the polar decomposition 
A = UP, so TA = (TU)P and you can apply the first part of the proof.] 
Note: For complete proofs, cf. [La 3], Appendix of Chapter VII, and [Sh]. 


CHAPTER XIX 


Further Spectral Theorems 


In this chapter, we use the spectral theorem of Chapter XVIII to give a 
finer theory, making sense of the expression f(A) when f is not continu- 
ous. Ultimately, one wants to use very general functions f in the context 
of measure theory, namely bounded measurable functions, as a corollary 
of what was done in Chapter XVIII. For our purposes here, we deal 
with an intermediate category of functions, essentially characteristic func- 
tions of intervals. These give rise to projection operators, whose formal- 
ism is important for its own sake. We also want to deal with unbounded 
operators as an application. 

This chapter may be omitted, and is included only for those who want 
to go into spectral theory a little more deeply, as in the next chapter. 
The development of spectral measures gives a good example of how 
measure theory and the general functional analysis of this chapter can be 
put together. 


XIX, §1. PROJECTION FUNCTIONS OF OPERATORS 


We need to extend the notion f(A) to functions f which are not continu- 
ous, to include at least characteristic functions of intervals. We follow 
Riesz-Nagy more or less. We let H be a Hilbert space. 


Lemma 1.1. Let $\alpha$ be real, and let $\{A_n\}$ be a sequence of hermitian 
operators such that $A_n \geq \alpha I$ for all n, and such that $A_n \geq A_{n+1}$. Given 
$v \in H$, the sequence $\{A_n v\}$ converges to an element of H. If we denote 
this element by Av, then $v \mapsto Av$ is a bounded hermitian operator. 
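Proof. From the inequality 

$$\langle A_n v, v \rangle \geq \alpha \langle v, v \rangle$$

we conclude that $\langle A_n v, v \rangle$ converges, for each $v \in H$. Since 

$$\langle A_n v, w \rangle = \tfrac{1}{4} \langle A_n(v + w), v + w \rangle - \tfrac{1}{4} \langle A_n(v - w), v - w \rangle$$

(in the complex case the right-hand side is the real part of $\langle A_n v, w \rangle$, and 
the imaginary part is obtained by replacing w with iw), 
it follows that $\langle A_n v, w \rangle$ converges for each pair of elements v, $w \in H$. 
Define 

$$\lambda_v(w) = \lim_{n \to \infty} \langle A_n v, w \rangle.$$

Then $\lambda_v$ is antilinear, and $|\langle A_n v, w \rangle| \leq C|v||w|$ for some C and all v, 
$w \in H$. Hence there exists an operator A such that 

$$\langle Av, w \rangle = \lim \langle A_n v, w \rangle.$$

Since $\langle A_n v, w \rangle = \langle v, A_n w \rangle$, it follows that A is hermitian. 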


Lemma 1.2. Let f be a function on the spectrum of A, bounded from 
below, and which can be expressed as a pointwise convergent limit of a 
decreasing sequence of continuous functions, say $\{h_n\}$. Then 

$$\lim_{n \to \infty} h_n(A)$$

is independent of the sequence $\{h_n\}$. 

Proof. Say $g_n(t)$ decreases also to f(t). Given k, for large n we have 

$$\max(g_n, h_k) \leq h_k + \varepsilon,$$

by Dini's theorem, so for all t we have $g_n(t) \leq h_k(t) + \varepsilon$, and hence 

$$g_n(A) \leq h_k(A) + \varepsilon I.$$

This shows that 

$$\lim_n g_n(A) \leq h_k(A) + \varepsilon I,$$

and therefore that 

$$\lim_n g_n(A) \leq \lim_k h_k(A) + \varepsilon I.$$

This is true for all $\varepsilon$. Letting $\varepsilon \to 0$ and using symmetry, we have proved 
our lemma. 




From Lemma 1.2, we see that the association 


$$f \mapsto f(A)$$


can be extended to the linear space generated by functions which can be 
obtained as limits from above of decreasing sequences of continuous 
functions, and are bounded from below. The map is additive, order 
preserving, and clearly multiplicative, i.e. 


$$(fg)(A) = f(A)g(A)$$


for f, g in this vector space. 

The most important functions to which we apply this extension are 
characteristic functions like the function $\psi_c(t)$ whose graph is drawn in 
Figure 19.1. It is a limit of the functions $h_n(t)$ drawn in Figure 19.2. 

[Figure 19.1: graph of $\psi_c(t)$] 

[Figure 19.2: graph of $h_n(t)$] 


Lemma 1.3. Let $\psi_c(A) = P_c$. If $\alpha I \leq A \leq \beta I$, then: 

(i) $P_c = O$ if $c < \alpha$, and $P_c = I$ if $c \geq \beta$. 
(ii) If $c \leq c'$, then $P_c \leq P_{c'}$. 


Proof. Clear from Lemma 1.2. 


Observe that we also have $P_c^2 = P_c$, i.e. that $P_c$ is a projection. We call 
$\{P_c\}$ the spectral family associated with A. 
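
For a hermitian matrix the spectral family can be written down directly, which gives a quick check of Lemma 1.3 and of the projection property. This is a finite dimensional sketch under our own assumptions: $\psi_c$ is taken to be the characteristic function of the half line $t \leq c$, so that $P_c$ is the orthogonal projection onto the span of the eigenvectors with eigenvalue at most c. The matrix `A`, the helper `spectral_projection`, and the small tolerance `tol` (which guards against rounding in the computed eigenvalues) are all hypothetical choices.

```python
import numpy as np

def spectral_projection(A, c, tol=1e-9):
    """P_c for a hermitian matrix A: projection onto eigenvectors with eigenvalue <= c."""
    eigvals, eigvecs = np.linalg.eigh(A)
    selected = eigvecs[:, eigvals <= c + tol]
    return selected @ selected.conj().T

# Hypothetical example with eigenvalues 1, 3, 5, so that 1*I <= A <= 5*I.
A = np.array([[2.0, 1.0, 0.0],
              [1.0, 2.0, 0.0],
              [0.0, 0.0, 5.0]])

P2 = spectral_projection(A, 2.0)   # projection onto the eigenvalue-1 eigenspace
P4 = spectral_projection(A, 4.0)   # projection onto the eigenvalues 1 and 3
```

In this example `P2 @ P2` equals `P2`, the difference `P4 - P2` is positive, and $P_c$ is the zero matrix for c below 1 and the identity for c at least 5.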
We keep the same notation, and we shall make use of the two functions 
$f_c$, $g_c$ whose graphs are drawn in Figure 19.3. Thus $f_c(t) + g_c(t) = |t - c|$. 
We have 

$$(t - c)(1 - \psi_c(t)) = f_c(t).$$

[Figure 19.3: graphs of $f_c(t)$ and $g_c(t)$] 

Hence 

(1) $(A - cI)(I - P_c) = f_c(A)$, 
(2) $A - cI = f_c(A) - g_c(A)$, 
(3) $(A - cI)P_c = -g_c(A)P_c = -g_c(A)$. 


Theorem 1.4. Let $\{P_c\}$ be the spectral family associated with A. If b < c, 
then we have 

$$bI \leq A \leq cI \quad \text{on } \operatorname{Im}(P_c - P_b).$$

Proof. From (1) above, we have $A - bI = f_b(A)$ on the orthogonal 
complement of $P_b$, whence the inequality $bI \leq A$ follows on this complement 
since $f_b \geq 0$. From (3) above, we have 

$$A - cI = -g_c(A)$$

on the image of $P_c$, and since $-g_c$ is $\leq 0$, we get $A \leq cI$ on this image. 
This proves our theorem. 


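Theorem 1.5. The family $\{P_c\}$ is strongly continuous from the right. 

Proof. Let $v \in H$. Our assertion means that $P_{c+\varepsilon} v \to P_c v$ as $\varepsilon \to 0$. It 
suffices to prove that 

$$\langle P_{c+\varepsilon} v, v \rangle \to \langle P_c v, v \rangle$$

because 

$$\langle (P_{c+\varepsilon} - P_c)v, v \rangle = |(P_{c+\varepsilon} - P_c)v|^2.$$

Let $h_\varepsilon(t)$ be the function whose graph is shown in Figure 19.4. We have 

$$\psi_c(t) \leq \psi_{c+\varepsilon}(t) \leq h_\varepsilon(t)$$

and 

$$P_c \leq P_{c+\varepsilon} \leq h_\varepsilon(A).$$

[Figure 19.4: graph of $h_\varepsilon(t)$] 

In other words, we have 

$$\langle P_c v, v \rangle \leq \langle P_{c+\varepsilon} v, v \rangle \leq \langle h_\varepsilon(A)v, v \rangle.$$

We let $\varepsilon \to 0$. Then $h_\varepsilon(t)$ decreases to $\psi_c(t)$ and $\langle h_\varepsilon(A)v, v \rangle$ decreases to 
$\langle P_c v, v \rangle$, which completes the proof of the theorem. 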


Theorem 1.6 (Lorch). From the left, 

$$\lim_{\varepsilon \to 0} (P_c - P_{c-\varepsilon}) = Q_c$$

is the projection on the c-eigenspace of A. 

Proof. Using Theorem 1.4, we have 

$$(c - \varepsilon)(P_c - P_{c-\varepsilon}) \leq A(P_c - P_{c-\varepsilon}) \leq c(P_c - P_{c-\varepsilon})$$

whence 

$$|(A - cI)(P_c - P_{c-\varepsilon})| \leq \varepsilon.$$

But for each v, $\lim_{\varepsilon \to 0} (P_c - P_{c-\varepsilon})v$ exists, say = w. It follows that 
Aw = cw, i.e. $Q_c$ maps H into the c-eigenspace. 

Conversely, if $\varphi$ is a continuous function, then for any A-invariant 
closed subspace F, we have 

$$\varphi(A|F) = \varphi(A)|F.$$

We want to show that $Q_c$ is the identity on the c-eigenspace, and without 
loss of generality we may therefore assume that $H = H_c$ is the eigenspace. 
Then $P_c = I$ because $\psi_c = 1$ on the spectrum of A. If b < c, then 

$$f_b(A) = A - bI = (c - b)I$$

is invertible, and hence $P_b = O$ by (1). This proves Lorch's theorem. 
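
Lorch's theorem is easy to observe for a hermitian matrix: the jump of the spectral family at c is the eigenprojection for c, and there is no jump at a point that is not an eigenvalue. The sketch below assumes, as before, that $P_c$ is the projection onto the eigenvectors with eigenvalue at most c. The example matrix, the helper name, the tolerance `tol`, and the step `eps` are our own choices, and a fixed small `eps` stands in for the limit.

```python
import numpy as np

def spectral_projection(A, c, tol=1e-9):
    """P_c for a hermitian matrix A: projection onto eigenvectors with eigenvalue <= c."""
    eigvals, eigvecs = np.linalg.eigh(A)
    selected = eigvecs[:, eigvals <= c + tol]
    return selected @ selected.conj().T

# Hypothetical example with eigenvalues 1, 3, 5.
A = np.array([[2.0, 1.0, 0.0],
              [1.0, 2.0, 0.0],
              [0.0, 0.0, 5.0]])

eps = 1e-6
# Jump at the eigenvalue c = 3: projection onto the line spanned by (1, 1, 0).
Q3 = spectral_projection(A, 3.0) - spectral_projection(A, 3.0 - eps)
# No jump at c = 4, which is not an eigenvalue.
Q4 = spectral_projection(A, 4.0) - spectral_projection(A, 4.0 - eps)
```

Here `A @ Q3` equals `3 * Q3`, so the image of `Q3` lies in the 3-eigenspace, while `Q4` is the zero matrix.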




XIX, §2. SELF-ADJOINT OPERATORS 


Let H be a Hilbert space and A a linear map, 

$$A\colon D_A \to H$$

defined on a dense subspace. Consider the set of vectors $v \in H$ such that 
there exists $w \in H$ such that 

$$\langle u, w \rangle = \langle Au, v \rangle, \quad \text{all } u \in D_A,$$

or in other words, $\langle u, w \rangle - \langle Au, v \rangle = 0$. The set of such v is the 
projection on the first factor of the intersection of the kernels of 

$$(v, w) \mapsto \langle u, w \rangle - \langle Au, v \rangle, \quad u \in D_A.$$

It is a vector space. To each v in this vector space there is exactly one 
w, if it exists, having the above property, because 

$$u \mapsto \langle Au, v \rangle$$

is a functional on a dense subspace. Hence we can define an operator 
A* by the formula 

$$A^*v = w,$$

on the space $D_{A^*}$ of such vectors v. We call the pair $(A^*, D_{A^*})$ the adjoint 
of A. 

Let $J\colon H \times H \to H \times H$ be the operator such that $J(x, y) = (-y, x)$. 
Then $J^2 = -I$. We note that the graph $G_{A^*}$ of A* is given by the 
formula 

$$G_{A^*} = (JG_A)^{\perp},$$

where $\perp$ denotes orthogonal complement, and hence the graph of A* is 
closed. 

We say that A is closed if its graph $G_A$ is closed. 

If A is closed, then $D_{A^*}$ is dense in H. 

Proof. Let $h \in D_{A^*}^{\perp}$, so 

$$(h, 0) \in (G_{A^*})^{\perp} = (JG_A)^{\perp\perp} = JG_A$$

because we assumed that A is closed. We conclude that $(0, h) \in G_A$, and 
hence h = 0, proving our assertion. 


If A is closed, then A** = A. 

Proof. $G_{A^{**}} = (JG_{A^*})^{\perp} = (J(JG_A)^{\perp})^{\perp} = G_A$. 

If $D_A$ and $D_{A^*}$ are dense, then $G_{A^{**}}$ = closure of $G_A$. 

Proof. $G_{A^{**}} = (JG_{A^*})^{\perp} = (J(JG_A)^{\perp})^{\perp} = \overline{G_A}$. 

If A is defined on $D_A$ and B is defined on $D_B$, if $D_A \subset D_B$, and if the 
restriction of B to $D_A$ is A, then one usually says that A is contained in B, 
and one writes $A \subset B$. The above assertion shows that $A \subset A^{**}$. 

We say that A is symmetric if $\langle Au, v \rangle = \langle u, Av \rangle$ for all u, $v \in D_A$. We 
say that A is self-adjoint, A = A*, if in addition $D_A = D_{A^*}$. 

If A is symmetric, then $A \subset A^*$. 

This is clear. Recall that we assumed $D_A$ dense in H. 


If A, B are self-adjoint and $A \subset B$, then A = B. 


This is also clear, because in general $B^* \subset A^*$, so in the self-adjoint 
case, $B \subset A$, whence A = B. 


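Let A be symmetric, defined on $D_A$ dense as above. Let $\lambda \in \mathbf{C}$ not be 
real. Then $A - \lambda I$ is injective on $D_A$, because from 

$$Au = \lambda u \quad \text{and} \quad \langle Au, u \rangle = \langle u, Au \rangle$$

we conclude 

$$\lambda \langle u, u \rangle = \langle Au, u \rangle = \langle u, Au \rangle = \bar{\lambda} \langle u, u \rangle,$$

so u = 0. Hence we can define an operator 

$$U = U_\lambda = (A + \bar{\lambda} I)(A + \lambda I)^{-1}$$

on the image $(A + \lambda I)D_A$. We contend that U is unitary. This amounts to 
verifying that for u, $v \in D_A$, we have 

$$\langle Au + \lambda u, Av + \lambda v \rangle = \langle Au + \bar{\lambda} u, Av + \bar{\lambda} v \rangle,$$

which is obvious. 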
Lemma 2.1. If A is symmetric, closed, and $\lambda \in \mathbf{C}$ is not real, then 
$(A + \lambda I)D_A$ is closed. 


Proof. Let $\{u_n\}$ be a sequence in $D_A$ such that $\{(A + \lambda I)u_n\}$ is Cauchy. 
Since U is unitary, it follows that 

$$\{(A + \bar{\lambda} I)u_n\}$$

is also Cauchy, hence $\{(\lambda - \bar{\lambda})u_n\}$ is Cauchy, and $\{u_n\}$ is Cauchy, say 
converging to u. But 

$$\{2Au_n + (\lambda + \bar{\lambda})u_n\}$$

is Cauchy, whence also $\{Au_n\}$ is Cauchy. Since the graph of A is 
assumed closed, we conclude that $\{(u_n, Au_n)\}$ converges to an element 
(u, Au) in the graph, and the sequence 

$$\{(A + \lambda I)u_n\}$$

converges to $(A + \lambda I)u$. This proves that $(A + \lambda I)D_A$ is closed. 


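Theorem 2.2. Let A be symmetric, closed with dense domain. Let $\lambda \in \mathbf{C}$ 
be not real, and such that $(A + \lambda I)D_A$ and $(A + \bar{\lambda} I)D_A$ are dense (whence 
equal to H by the lemma). Then A is self-adjoint. 


Proof. Let $v \in D_{A^*}$. It suffices to show that $v \in D_A$. We have by 
definition 

$$\langle Au, v \rangle = \langle u, A^*v \rangle, \quad \text{all } u \in D_A.$$

Since $(A + \lambda I)D_A = H$, there exists $u_1 \in D_A$ such that 

$$A^*v + \lambda v = Au_1 + \lambda u_1.$$

Then 

$$\langle Au, v \rangle = \langle u, Au_1 + \lambda u_1 - \lambda v \rangle, \quad \text{all } u \in D_A,$$

whence 

$$\langle (A + \bar{\lambda} I)u, v \rangle = \langle (A + \bar{\lambda} I)u, u_1 \rangle, \quad \text{all } u \in D_A.$$

This proves that $v = u_1$, as was to be shown. 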


Remark. In the literature, you will find that the dimension of the 
cokernel of $(A + \lambda I)D_A$ is called a defect index. We are concerned here 
with a situation in which the defect indices are 0. 


Corollary 2.3. Let A be symmetric with dense domain. Let $\lambda \in \mathbf{C}$ be not 
real, and such that $(A + \lambda I)D_A$ and $(A + \bar{\lambda} I)D_A$ are dense. Then the 
closure of $G_A$ is the graph of an operator which is self-adjoint. 

Proof. Since A is symmetric, the domain of A* is also dense, and we 
have shown above that $G_{A^{**}}$ is the closure of $G_A$, so A has a closure. 
It is immediate that this closure is also symmetric, and the theorem 
applies. 


An operator A defined on $D_A$ is called essentially self-adjoint if the 
closure of its graph is the graph of a self-adjoint operator. The corollary 
gives a sufficient condition for an operator to be essentially self-adjoint. 


Theorem 2.4. Let A be a self-adjoint operator. Let $z \in \mathbf{C}$ and z not 
real. Then $A - zI$ has kernel 0. There is a unique bounded operator 

$$R(z) = (A - zI)^{-1}\colon H \to D_A$$

which establishes a bijection between H and $D_A$, and is the inverse of 
$A - zI$. We have 

$$R(z)^* = R(\bar{z}).$$

If Im z, Im w $\neq 0$, then we have the resolvent equation 

$$(z - w)R(z)R(w) = R(z) - R(w) = (z - w)R(w)R(z),$$

so in particular, R(z), R(w) commute. We have $|R(z)| \leq 1/|\operatorname{Im} z|$. 

Proof. Let z = x + iy. If u is in the domain of A, then 

$$|(A - zI)u|^2 = |(A - xI)u|^2 + y^2|u|^2 \geq y^2|u|^2$$

because A is symmetric, so the cross terms disappear. This proves that 
the kernel of $A - zI$ is 0, and that the inverse of $A - zI$ is continuous, 
when viewed as defined on the image of $A - zI$. If v is orthogonal to 
this image, i.e. 

$$\langle Au - zu, v \rangle = 0$$

for all $u \in D_A$, then $\langle Au, v \rangle = \langle u, \bar{z}v \rangle$, and since A is self-adjoint, it 
follows that v lies in the domain of A and that $Av = \bar{z}v$. Since the kernel 
of $A - \bar{z}I$ is 0, we conclude that v = 0. Hence the image of $A - zI$ is 
dense, so that by Lemma 2.1 this image is all of H and R(z) is everywhere 
defined, equal to the inverse of $A - zI$. We then have 

$$[(A - wI) - (A - zI)]R(w) = (z - w)R(w).$$

Multiplying this on the left by R(z) yields the resolvent formula of the 
theorem, whose proof is concluded. 
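
The three assertions of Theorem 2.4 can be checked numerically for a hermitian matrix, which is the bounded, finite dimensional case of the theorem. This is a sanity check under our own choices of the matrix `A` and of the non-real points `z` and `w`. It does not touch the domain questions that the theorem is really about.

```python
import numpy as np

# Hypothetical real symmetric (hence hermitian) example.
A = np.array([[2.0, 1.0],
              [1.0, -1.0]])
I = np.eye(2)

def resolvent(z):
    """R(z) = (A - zI)^{-1} for non-real z."""
    return np.linalg.inv(A - z * I)

z, w = 1.0 + 2.0j, -0.5 - 0.3j

# Resolvent equation: (z - w) R(z) R(w) = R(z) - R(w).
lhs = (z - w) * resolvent(z) @ resolvent(w)
rhs = resolvent(z) - resolvent(w)

# Operator norm of R(z), to be compared with 1/|Im z|.
norm_Rz = np.linalg.norm(resolvent(z), 2)
```

The arrays `lhs` and `rhs` agree up to rounding, `norm_Rz` stays below `1 / abs(z.imag)`, and the conjugate transpose of `resolvent(z)` equals `resolvent(z.conjugate())`.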
We write 

$$R(i) = (A - iI)^{-1} = C + iB$$

where B, C are bounded hermitian. From the resolvent equation between 
R(i) and R(-i) we conclude that B, C commute. We may call B the 
imaginary part of $(A - iI)^{-1}$, symbolically 

$$B = \operatorname{Im}(A - iI)^{-1}.$$


Lemma 2.5. With the above notation, we have C = AB and $BA \subset AB$. 
The kernel of B is 0, and $O \leq B \leq I$. 


Proof. We have from R(i)* = R(-i) that 

$$(A - iI)^{-1} - (A + iI)^{-1} = 2iB.$$

We multiply this on the left with A, noting that 

$$A(A - iI)^{-1} = i(A - iI)^{-1} + I$$

and 

$$A(A + iI)^{-1} = -i(A + iI)^{-1} + I.$$

We then obtain C = AB. For BA we multiply the first relation on the 
right by A, so that we use 

$$(A - iI)^{-1}(A - iI) = I_{D_A},$$

and similarly for A + iI. The relation $BA \subset AB$ follows. The kernel of B 
is 0, for any vector in the kernel is also in the kernel of C = AB, whence 
in the kernel of $(A - iI)^{-1}$, and therefore equal to 0. We leave the 
relation $B \geq O$ to the reader. That $B \leq I$ follows from $|R(i)| \leq 1$, a 
special case of the last inequality in Theorem 2.4. 
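
In the bounded case the operators B and C of Lemma 2.5 are explicit: one finds $B = (A^2 + I)^{-1}$ and $C = A(A^2 + I)^{-1}$, from which $O \leq B \leq I$ is visible. The sketch below checks this for a hypothetical hermitian matrix `A` of our own choosing. It is an illustration in finite dimensions only.

```python
import numpy as np

# Hypothetical real symmetric (hence hermitian) example.
A = np.array([[2.0, 1.0],
              [1.0, -1.0]])
I = np.eye(2)

R = np.linalg.inv(A - 1j * I)      # R(i) = (A - iI)^{-1}
C = (R + R.conj().T) / 2           # hermitian part of R(i)
B = (R - R.conj().T) / (2j)        # "imaginary part" of R(i), also hermitian

eig_B = np.linalg.eigvalsh(B)      # eigenvalues of B, expected in the interval (0, 1]
```

One then checks that `C` equals `A @ B`, that `B` equals the inverse of `A @ A + I`, and that the eigenvalues of `B` are positive and at most 1, so that B has kernel 0.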


We now give an example of a self-adjoint operator. It will be shown 
later that any self-adjoint operator is of this nature. 


Theorem 2.6. Let $\{H_n\}$ be a sequence of Hilbert spaces. Let $A_n$ be a 
bounded self-adjoint operator on $H_n$. Let H be the orthogonal direct 
sum of the $H_n$, so that H consists of all series $\sum u_n$ with $\sum |u_n|^2 < \infty$. 
There exists a unique self-adjoint operator A on H such that each $H_n$ 
is contained in the domain $D_A$ and such that the restriction of A to $H_n$ is 
$A_n$. Its domain is the vector space of series $u = \sum u_n$ such that 

$$\sum |A_n u_n|^2 < \infty,$$

and $Au = \sum A_n u_n$. 

Proof. The uniqueness is clear from the property that if A, B are 
self-adjoint and $A \subset B$, then A = B. It suffices now to prove that if we 
let $D_A$ be the domain described above, and define Au by $\sum A_n u_n$, then A 
is self-adjoint. It is clear that A is symmetric. Let $v \in D_{A^*}$. Then 

$$\langle u, A^*v \rangle = \langle Au, v \rangle, \quad \text{all } u \in D_A.$$

Say $u = \sum u_n$. Then 

$$\sum \langle u_n, A^*v \rangle = \sum \langle Au_n, v \rangle.$$

If $u_n \in H_n$, then 

$$\langle u_n, A^*v \rangle = \langle Au_n, v \rangle,$$

$$\langle u_n, (A^*v)_n \rangle = \langle A_n u_n, v_n \rangle,$$

whence $(A^*v)_n = A_n v_n$. Then 

$$\sum |A_n v_n|^2 = \sum |(A^*v)_n|^2 = |A^*v|^2,$$

whence $v \in D_A$, so $D_{A^*} \subset D_A$ and A is therefore self-adjoint. This proves 
the theorem. 


In the situation of Theorem 2.6, we use the notation 

$$A = \bigoplus A_n.$$

We deal with the converse of Theorem 2.6. Let A be an arbitrary 
self-adjoint operator on the Hilbert space H, and let 

$$(A - iI)^{-1} = C + iB$$

as above. 

We are in a position to decompose our Hilbert space by means of B. 
Let $\theta_n$ be the function whose graph is given in Figure 19.5, and which 
gives rise to a projection operator. 

[Figure 19.5: graph of $\theta_n$, with $\theta_n(t) = 0$ outside the marked interval] 
Let $\{P_c\}$ be the spectral family for B, and let 

$$Q_n = \theta_n(B) = P_{1/n} - P_{1/(n+1)}.$$

Then $Q_n$ is a projection operator, and we let 

$$H_n = Q_n H = \operatorname{Im} Q_n.$$

Then 

$$H = \bigoplus H_n$$

is an orthogonal direct sum. In fact, let $\theta$ and $\eta$ be the functions whose 
graphs are shown in Figure 19.6(a) and (b) respectively. 

[Figure 19.6: (a) graph of $\theta(t)$; (b) graph of $\eta(t)$] 


Then $1 - \theta = \eta$ and $\eta(B) = O$ because the spectral family for B is continuous 
at 0, in view of Lemma 2.5 (kernel B = 0) and Lorch's theorem 
(Theorem 1.6). 

Let $s_n(t)$ be the function whose graph is shown in Figure 19.7. Then 

$$B s_n(B) = \theta_n(B) = Q_n.$$

[Figure 19.7: graph of $s_n(t)$, with $s_n(t) = 0$ outside the marked interval] 


Theorem 2.7. Let A be a self-adjoint operator and let $B = \operatorname{Im}(A - iI)^{-1}$. 
Let $Q_n = \theta_n(B)$ be the projection operator defined by the function $\theta_n$ 
above. Then A is defined on Im $Q_n$, and 

$$Q_n A \subset AQ_n = s_n(B)C.$$

Let $H_n = Q_n H$. Then H is the orthogonal direct sum of the spaces $H_n$, 
the restriction of A to $H_n$ is a bounded operator $A_n$, and 

$$A = \bigoplus A_n.$$

Proof. Since $t s_n(t) = \theta_n(t)$, we get $B s_n(B) = \theta_n(B) = Q_n$. Then by Lemma 
2.5 

$$AQ_n = AB s_n(B) = C s_n(B) = s_n(B)C.$$

In particular, $AQ_n$ is everywhere defined. On the other hand, 

$$Q_n A = s_n(B)BA \subset s_n(B)AB \subset s_n(B)C.$$

This proves that $Q_n A \subset AQ_n$. It means that given $v \in D_A$, if 

$$v = \sum v_n$$

is the decomposition of v according to the spaces $H_n$, and if 

$$Av = \sum w_n,$$

then 

$$Q_n Av = w_n = AQ_n v = Av_n.$$

So $Av = \sum Av_n$, and the theorem is proved. 


XIX, §3. EXAMPLE: THE LAPLACE OPERATOR 
IN THE PLANE 


We shall give a typical example of an unbounded symmetric operator. 
We shall assume that the reader is acquainted with some notions of 
advanced calculus, and in particular with Stokes’ theorem in the plane. 
These notions are treated later in this book, but to give examples, one 
has to use something concrete, taken possibly from other courses. We let 
(x, y) be the variables of $\mathbf{R}^2$ and we let 

$$\Delta = \frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2}$$

be the Laplace operator. We let 

$$L = -\Delta$$

in order that L turn out to be a positive operator. 
If U is a region in the plane with piecewise smooth boundary, and 
F = (f, g) is a smooth vector field on U, we have the Stokes-Green 
theorem in dimension 2, 

$$\iint_U \operatorname{div} F \, dx \, dy = \int_{\mathrm{Bd}\, U} F \cdot n \, ds,$$

where Bd U is the boundary of U with the appropriate orientation. 
Letting 

$$F = g \operatorname{grad} f = (g f_x, g f_y) \quad \text{or} \quad f \operatorname{grad} g = (f g_x, f g_y),$$

we obtain the formula 

$$\iint_U (g \Delta f - f \Delta g) \, dx \, dy = \int_{\mathrm{Bd}\, U} \left( g \frac{\partial f}{\partial n} - f \frac{\partial g}{\partial n} \right) ds.$$

If f, g have compact support, and U is the whole plane, then there is no 
boundary, and the term on the right in this last formula is equal to 0. 
We shall use the scalar product for f, $g \in C_c^\infty(\mathbf{R}^2)$ given by 

$$\langle f, g \rangle = \iint_{\mathbf{R}^2} f(x, y) g(x, y) \, dx \, dy.$$

Consider now L or $\Delta$ to be operators on the space $C_c^\infty(\mathbf{R}^2)$ of 
$C^\infty$-functions with compact support. Then L is of course not bounded, it is 
just a linear map. To avoid putting complex conjugates, assume that the 
functions f, g are real valued in this space. Then we find: 


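L 1. The operator L or $\Delta$ is symmetric, that is 

$$\langle Lf, g \rangle = \langle f, Lg \rangle.$$

This comes from the Stokes-Green formula with no boundary, as we just 
observed. In addition, we have the property: 


L 2. The operator L is positive, and in fact we have 

$$\langle Lf, f \rangle = \iint_{\mathbf{R}^2} \left[ \left( \frac{\partial f}{\partial x} \right)^2 + \left( \frac{\partial f}{\partial y} \right)^2 \right] dx \, dy.$$

To prove this second formula, consider the differential form 

$$\omega(x, y) = f \frac{\partial f}{\partial x} \, dy - f \frac{\partial f}{\partial y} \, dx.$$

Then 

$$d\omega = \left[ \left( \frac{\partial f}{\partial x} \right)^2 + \left( \frac{\partial f}{\partial y} \right)^2 \right] dx \wedge dy + (f \Delta f) \, dx \wedge dy,$$

whence the second formula follows by the standard form of Stokes' 
theorem, namely 

$$\iint_U d\omega = \int_{\mathrm{Bd}\, U} \omega.$$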
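
Properties L 1 and L 2 have exact discrete analogues, which may help to see why no boundary term appears. The sketch below replaces the plane by a periodic grid (our own stand-in for "no boundary"), uses the five-point difference Laplacian, and compares $\langle Lf, f \rangle$ with the sum of squared forward differences. The grid size, the spacing `h`, the random grid functions, and the helper names are all assumptions made for the illustration.

```python
import numpy as np

def neg_laplacian(f, h):
    """Five-point approximation of L = -Laplacian on a periodic grid with spacing h."""
    neighbors = (np.roll(f, 1, 0) + np.roll(f, -1, 0)
                 + np.roll(f, 1, 1) + np.roll(f, -1, 1))
    return -(neighbors - 4.0 * f) / h**2

def inner(f, g, h):
    """Discrete version of the scalar product <f, g> (Riemann sum)."""
    return float(np.sum(f * g) * h**2)

n = 32
h = 1.0 / n
rng = np.random.default_rng(2)
f = rng.standard_normal((n, n))
g = rng.standard_normal((n, n))

# L 1: symmetry, <Lf, g> - <f, Lg> should vanish up to rounding.
sym_gap = inner(neg_laplacian(f, h), g, h) - inner(f, neg_laplacian(g, h), h)

# L 2: <Lf, f> equals the discrete energy built from forward differences.
fx = (np.roll(f, -1, 0) - f) / h
fy = (np.roll(f, -1, 1) - f) / h
energy = inner(fx, fx, h) + inner(fy, fy, h)
quad = inner(neg_laplacian(f, h), f, h)
```

The identity between `quad` and `energy` is summation by parts on the periodic grid, the discrete counterpart of applying Stokes' theorem with no boundary.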


Considering L as an operator on functions on the unit disc, say, or on 
the plane with appropriate behavior at infinity to insure convergence, one 
can then show that L is a self-adjoint operator. For the analogous 
theorem on the upper half plane, cf. $SL_2(\mathbf{R})$. 

It is a technical and not trivial matter to give an explicit representa- 
tion for the resolvent in terms of classical integrals. We don’t go into 
this here. However, we do mention one other object associated with a 
situation like the above, namely a fundamental solution. Let z = (x, y), 
and define 

$$G(z, z') = \frac{1}{2\pi} \log |z - z'|.$$

Then G is symmetric in z, z' and is $C^\infty$ except on the diagonal z = z'. 
Furthermore, the (improper) integral of log|z| exists on any compact 
subset of the plane, because the function log r is locally integrable near 
r = 0, and 

$$dx \, dy = r \, dr \, d\theta,$$

where $(r, \theta)$ are polar coordinates. (No fancy integration is needed here, 
but in the language of integration, we could say that log r is locally $L^1$.) 


Now we may view G as defining an integral operator 

$$S_G\colon C_c^\infty(\mathbf{R}^2) \to C^\infty(\mathbf{R}^2)$$

by the formula 

$$S_G f(z') = \int_{\mathbf{R}^2} G(z, z') f(z) \, dz,$$

where dz = dx dy for simplicity. This integral can also be written 

$$\frac{1}{2\pi} \int_{\mathbf{R}^2} (\log |w|) f(w + z') \, dw.$$

Standard theorems from advanced calculus justify the fact that you can 
differentiate under the integral sign because f is assumed to be smooth 
with compact support, thus showing that $S_G f$ is $C^\infty$. (Cf. Lemma 2.2 of 
Chapter VIII.) If D is the open disc of radius 1, then it is immediate that 
$S_G$ maps $C_c^\infty(D)$ into $BC^\infty(D)$, namely that $S_G f$ is bounded. 

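We now have the fundamental formula: 

L 3. $S_G \circ \Delta = I$, 

where I is the identity. Suppose we fix z', and let 

$$g(z) = G(z, z')$$

viewed as a function of z only. Then the formula asserts that 

$$\iint g \Delta f \, dx \, dy = f(z').$$

For the proof, let r = |z - z'|, and view f in terms of the polar 
coordinates $(r, \theta)$. Apply the Stokes-Green formula to the region $U(\varepsilon)$ outside a 
circle of radius $\varepsilon$, so that the boundary is the circle $S(\varepsilon)$ of radius $\varepsilon$ with 
reversed orientation. We have $ds = d\theta$ if we parametrize the circle of 
radius $\varepsilon$ by 

$$\left( \varepsilon \cos\left( \frac{\theta}{\varepsilon} \right), \varepsilon \sin\left( \frac{\theta}{\varepsilon} \right) \right), \qquad 0 \leq \theta \leq 2\pi\varepsilon.$$

Also, $\Delta g = 0$. The right-hand side of Green's formula gives 

$$\int_{S(\varepsilon)} \left( f \frac{\partial g}{\partial r} - g \frac{\partial f}{\partial r} \right) ds = \int_0^{2\pi\varepsilon} f(\varepsilon, \theta) \frac{1}{2\pi\varepsilon} \, d\theta - \int_0^{2\pi\varepsilon} \frac{1}{2\pi} (\log \varepsilon) \frac{\partial f}{\partial r} \, d\theta.$$

As $\varepsilon \to 0$, the first term goes to f(0, 0) and the second term goes to 0. 
Hence the desired formula follows. 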
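
Formula L 3 can be tested on a radial example with z' = 0. We take $f(z) = e^{-|z|^2}$, which is an assumption of ours: it is not compactly supported, but it decays fast enough that the same formula is expected to hold. Then $\Delta f = (4r^2 - 4)e^{-r^2}$, the angular integration cancels the factor $1/2\pi$, and the claim reduces to a one-variable integral whose value should be f(0) = 1. The sketch evaluates that integral with a plain midpoint rule. The cutoff `r_max` and the number of cells `n` are arbitrary choices.

```python
import numpy as np

def radial_integrand(r):
    """log(r) * (Laplacian f)(r) * r for f = exp(-r^2).

    This is g * Laplacian(f) integrated over the angle, with g = log(r) / (2*pi).
    """
    lap_f = (4.0 * r**2 - 4.0) * np.exp(-r**2)
    return np.log(r) * lap_f * r

n = 200_000
r_max = 10.0                        # the Gaussian tail beyond this radius is negligible
h = r_max / n
r_mid = (np.arange(n) + 0.5) * h    # midpoints, so r = 0 is never evaluated
value = float(np.sum(radial_integrand(r_mid)) * h)
```

With these choices `value` comes out close to 1, in agreement with f(0) = 1. The substitution $t = r^2$ turns the integral into $\int_0^\infty (t - 1) e^{-t} \log t \, dt$, which gives the same answer in closed form.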


We see that we have inverted the Laplace operator in some fashion, 
but only by a function with a logarithmic singularity, and definitely 
unbounded. Such a fundamental solution is, however, useful for con- 
structing other solutions, or for constructing the resolvent. We don’t go 
into this here. Cf. for instance, Folland [Fo]. The resolvent can be 
represented by a kernel in terms of Bessel functions. 


CHAPTER XX 


Spectral Measures 


In the spectral theorems of Chapters XVIII and XIX, we defined func- 
tions of an operator f(A) with continuous functions first, and then essen- 
tially characteristic functions of an interval by a limiting process. If v is 
a given vector, then the association 


$$f \mapsto \langle f(A)v, v \rangle$$


defines a functional on $C_c(\mathbf{R})$. But from Chapter IX we know that such a 
functional determines a unique measure, which is thus associated with the 
operator A and the vector v. This measure is called a spectral measure. 
The point of this chapter is to reformulate the results of previous chap- 
ters in terms of measures, and to extend the meaning of f(A) to cases 
when f is Borel measurable. In this way, we put together a lot of 
previous material, which is thus put to work: the spectral theorem of 
Chapter XIX; measures on locally compact spaces in Chapter IX; convo- 
lutions and Dirac sequences (families) in Chapter VIII. 


XX, §1. DEFINITION OF THE SPECTRAL MEASURE 


We first state formally as a theorem the measure associated with the 
functional mentioned in the introduction. 


Theorem 1.1. Let A be a bounded hermitian operator on a Hilbert 
space H. Let ve€ H. There exists a unique positive measure pw, = My, 
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on R such that for every m € C,(R) we have 


<@(A)v, v> = {, p dy,. 


This measure is finite, and especially 


Holl = Ho(R) S |oI?. 


Proof. Let $\lambda_v$ be the functional on $C_c(\mathbf{R})$ defined by

$$\lambda_v(\varphi) = \langle \varphi(A)v, v\rangle.$$

If $\varphi \geq 0$, then $\varphi(A) \geq O$ by Theorem 4.4 of Chapter XIX. Hence $\lambda_v$ is a
positive functional. The existence and uniqueness of the measure $\mu_v$ is
then a special case of Theorem 2.7 of Chapter IX. Furthermore, by
Theorem 4.4 of Chapter XIX we have

$$\langle \varphi(A)v, v\rangle \leq \|\varphi\| \, |v|^2,$$

thus giving the desired bound for the measure $\mu_v$, and concluding the
proof.
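As a finite-dimensional illustration of Theorem 1.1 (a sketch only, not part of the text's argument), take A to be a real symmetric matrix. Then the spectral measure is a finite sum of point masses at the eigenvalues, with weights $|\langle v, u_k\rangle|^2$ for an orthonormal eigenbasis $u_k$. The matrix, the vector and the test function below are arbitrary choices made for the example, and the helper names are hypothetical.

```python
import numpy as np

def spectral_measure(A, v):
    """Atoms and weights of mu_v for a real symmetric matrix A.

    mu_v = sum_k |<v, u_k>|^2 * delta_{lam_k}, where A u_k = lam_k u_k.
    """
    lam, U = np.linalg.eigh(A)      # columns of U are orthonormal eigenvectors
    weights = (U.T @ v) ** 2        # |<v, u_k>|^2 in the real case
    return lam, weights

def apply_function(A, phi):
    """phi(A) = U diag(phi(lam)) U^T for a real symmetric matrix A."""
    lam, U = np.linalg.eigh(A)
    return U @ np.diag(phi(lam)) @ U.T

A = np.array([[2.0, 1.0], [1.0, 2.0]])   # eigenvalues 1 and 3
v = np.array([1.0, 2.0])
phi = lambda t: np.exp(-t**2)            # illustrative continuous function

lam, w = spectral_measure(A, v)
lhs = v @ apply_function(A, phi) @ v     # <phi(A) v, v>
rhs = np.sum(phi(lam) * w)               # integral of phi against mu_v
total_mass = w.sum()                     # mu_v(R)
```

In this finite-dimensional setting the total mass equals $|v|^2$, consistent with the bound in the theorem.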


The measure $\mu_v$ is called the spectral measure associated with A and v.
By polarization, for $v, w \in H$ we see the existence and uniqueness of a
complex measure $\mu_{v,w}$ such that

$$\langle \varphi(A)v, w\rangle = \int_{\mathbf{R}} \varphi \, d\mu_{v,w} \quad \text{for all } \varphi \in C_c(\mathbf{R}).$$

It is clear that $\mu_{v,w}$ is C-linear in v and anti-linear in w. Furthermore, we
again have a bound

$$|\langle \varphi(A)v, w\rangle| \leq \|\varphi\| \, |v| \, |w|,$$

whence by Theorem 4.2 of Chapter IX we also obtain the bound

$$\|\mu_{v,w}\| \leq |v| \, |w|.$$

The measure $\mu_{v,w}$ is also called the spectral measure associated with A, v,
w. Applying the defining formula for $\mu_{v,w}$ to a real valued function $\varphi$, we
see immediately that

$$\overline{\mu_{v,w}} = \mu_{w,v}.$$

Let BM(R) be the Banach space of bounded (Borel) measurable functions
on R. For each $f \in BM(\mathbf{R})$ the association

$$(v, w) \mapsto \int_{\mathbf{R}} f \, d\mu_{v,w}$$

is linear in v and anti-linear in w. Furthermore

$$\left| \int_{\mathbf{R}} f \, d\mu_{v,w} \right| \leq \|f\|_\infty \, |v| \, |w|,$$

as one sees by applying the dominated convergence theorem to a sequence
$\{\varphi_n\}$ in $C_c(\mathbf{R})$ approaching f pointwise almost everywhere with
respect to the measure $|\mu_{v,w}|$, and such that $|\varphi_n| \leq \|f\|_\infty$. Thus our
association is continuous, and there exists a unique bounded operator, which
we denote by f(A), such that

$$\langle f(A)v, w\rangle = \int_{\mathbf{R}} f \, d\mu_{v,w}.$$
The following properties are then satisfied for $f, g \in BM(\mathbf{R})$.


SPEC 1. $(fg)(A) = f(A)g(A)$.

SPEC 2. $f(A)^* = \bar{f}(A)$.

SPEC 3. If $f_1$ is the function $f_1(t) = 1$, then $f_1(A) = I$.

SPEC 4. If the functions $f(t)$ and $g(t) = tf(t)$ are bounded measurable,
then $g(A) = Af(A)$.

SPEC 5. We have $|f(A)| \leq \|f\|_\infty$. Furthermore, if $\{f_n\}$ is a bounded
sequence in $BM(\mathbf{R})$ converging pointwise to f, then $\{f_n(A)\}$
converges strongly to $f(A)$.


Properties SPEC 1 through SPEC 4 are special cases of the spectral
theorem, as formulated in Theorem 4.4 of Chapter XVIII, in case
$f, g \in C_c(\mathbf{R})$, and so is the bound

$$|f(A)| \leq \|f\|_\infty$$

in that case. The properties for $f \in BM(\mathbf{R})$ then follow by applying
the dominated convergence theorem and taking limits. For instance, to
prove SPEC 1, fix $\psi \in C_c(\mathbf{R})$ and let $\{\varphi_n\}$ be a bounded sequence in $C_c(\mathbf{R})$
converging to f pointwise almost everywhere with respect to the positive
measures

$$|\mu_{\psi(A)v,w}| \quad \text{and} \quad |\mu_{v,w}|.$$

We obtain

$$\langle \varphi_n(A)\psi(A)v, w\rangle = \int \varphi_n \psi \, d\mu_{v,w} = \int \varphi_n \, d\mu_{\psi(A)v,w},$$

which converges to

$$\int f\psi \, d\mu_{v,w} = \langle (f\psi)(A)v, w\rangle$$

by the dominated convergence theorem if we use the first expression on
the right, and also converges to

$$\int f \, d\mu_{\psi(A)v,w} = \langle f(A)\psi(A)v, w\rangle$$

if we use the second expression on the right. This takes care of one
factor. We take care of the other by using a sequence $\{\psi_n\}$ converging to
g in the same manner as above. This proves SPEC 1, and also proves
the equivalent formula


SPEC 6. $\int fg \, d\mu_{v,w} = \int f \, d\mu_{g(A)v,w}$.
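The SPEC properties can be checked numerically in the simplest setting, a real symmetric matrix, where $f(A)$ is obtained by applying f to the eigenvalues. The sketch below is an illustration under that assumption; the helper `fcalc` and the particular functions f, g are hypothetical choices, with f a characteristic function so that a genuinely discontinuous Borel function is involved.

```python
import numpy as np

A = np.array([[2.0, 1.0], [1.0, 2.0]])       # symmetric, eigenvalues 1 and 3
lam, U = np.linalg.eigh(A)

def fcalc(f):
    """f(A) = U diag(f(lam)) U^T; f may be any bounded Borel function."""
    return U @ np.diag(f(lam)) @ U.T

f = lambda t: (t <= 2.0).astype(float)       # characteristic function of (-inf, 2]
g = lambda t: np.cos(t)                      # bounded, sup norm 1
fg = lambda t: f(t) * g(t)
tf = lambda t: t * f(t)

E = fcalc(f)                                 # spectral projection f(A)
spec1 = np.allclose(fcalc(fg), fcalc(f) @ fcalc(g))               # SPEC 1
spec3 = np.allclose(fcalc(lambda t: np.ones_like(t)), np.eye(2))  # SPEC 3
spec4 = np.allclose(fcalc(tf), A @ fcalc(f))                      # SPEC 4
spec5 = np.linalg.norm(fcalc(g), 2) <= 1.0 + 1e-12                # SPEC 5 bound
is_projection = np.allclose(E @ E, E) and np.allclose(E, E.T)
```

Since f takes only the values 0 and 1, SPEC 1 and SPEC 2 force $f(A)$ to be an orthogonal projection, which is what `is_projection` records.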


We wish to extend the above results to unbounded operators. 


Theorem 1.2. Let A be a self-adjoint operator. There exists a unique
association $f \mapsto f(A)$ from $BM(\mathbf{R})$ into the bounded operators on H
satisfying SPEC 1 through SPEC 5.


Proof. By Theorem 2.7 of Chapter XIX there exists a direct sum
decomposition

$$H = \bigoplus_n H_n \quad \text{and} \quad A = \bigoplus_n A_n,$$

where $A_n = A|H_n$ is the restriction of A to $H_n$ and is bounded self-adjoint.
Let $f \in BM(\mathbf{R})$ and $v \in H$,

$$v = \sum_n v_n.$$

Since $|f(A_n)v_n| \leq \|f\|_\infty |v_n|$, there is a unique bounded operator f(A) such
that

$$f(A)v = \sum_n f(A_n)v_n \quad \text{for all } v \in H.$$

To each $H_n$ and $v_n \in H_n$ we can associate the measure $\mu^{(n)}_{v_n}$ as above. Since

$$|\langle f(A_n)v_n, v_n\rangle| \leq \|f\|_\infty |v_n|^2,$$

the series

$$\sum_n \langle f(A_n)v_n, v_n\rangle$$

converges absolutely, and defines a positive functional on $C_c(\mathbf{R})$ (even on
BM(R)). Therefore:


Proposition 1.3. Given a direct sum decomposition as above, there exists
a unique positive measure $\mu_v$ such that for all $f \in C_c(\mathbf{R})$ we have the
formula

$$\int f \, d\mu_v = \sum_n \langle f(A_n)v_n, v_n\rangle = \sum_n \int f \, d\mu^{(n)}_{v_n}.$$


The formalism of the five SPEC properties extends at once to the case of
an unbounded operator A. For example, in the case of SPEC 4, note
that

$$|Af(A)v_n| \leq \|g\|_\infty |v_n|, \quad \text{where } g(t) = tf(t).$$

It follows that

$$\sum_n f(A)v_n \in D_A,$$

where $D_A$ is the domain of A, as in Chapter XIX, Theorems 2.6 and 2.7.
Hence SPEC 4 is valid.

Uniqueness will be proved in the next section, as an application of 
Dirac families. 


We defined the measure $\mu_v$ non-canonically, in a way seemingly dependent
on the decomposition of the Hilbert space into a direct sum such
that A restricts to a bounded operator on each summand. Of course, the
measure can be characterized intrinsically as follows.


Theorem 1.4. Let A be a self-adjoint operator on H. Let $v \in H$. The
measure $\mu_v$ is the unique positive measure $\mu$ such that for all $\varphi \in C_c(\mathbf{R})$
we have

$$\langle \varphi(A)v, v\rangle = \int_{\mathbf{R}} \varphi \, d\mu.$$


Proof. Assuming the uniqueness in Theorem 1.2 to be proved in the
next section, the present theorem is merely a special case of Theorem 2.7,
Chapter IX (associating measures to functionals).

Example. This example is the unbounded analogue of the example
given in Chapter XVIII, §4. Let A be self-adjoint, and assume that there
exists a positive number c such that

$$\langle Av, v\rangle \geq -c|v|^2 \quad \text{for all } v \in D_A.$$

Then there exists a unique differentiable mapping $K: \mathbf{R}^+ \to \mathrm{End}(H)$
satisfying the following conditions:


H 1. The image of K is contained in the domain of A, and

$$\frac{d}{ds} K(s) = -AK(s).$$

H 2. For each $v \in H$ we have $\lim_{s \to 0} K(s)v = v$.


The proof is an exercise, which can be carried out by using the case for 
bounded operators, and the direct sum decomposition of Theorem 2.6 
for unbounded operators. I took the statement from Faltings [Fa 1], 
Lemma 3.4. Readers can find another idea for the existence proof in this 
reference, which also proves the uniqueness. 

Observe that for each positive s, the function $\varphi_s(t) = e^{-st}$ is bounded
on the spectra of the bounded operators $A_n$ on the components $H_n$, and
sufficiently uniformly so that there is no difficulty in handling the existence
part of the proof by considering its effect on infinite sums $\sum v_n$. As
in Proposition 1.3 the existence proof by this method is not invariant,
but the uniqueness shows that the end result $\varphi_s(A)$ is independent of the
direct sum decomposition of H. One may write

$$K(s) = e^{-sA},$$

and K(s) is called the heat operator associated with A.
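For a symmetric matrix the heat operator can be written down explicitly, which gives a quick sanity check of conditions H 1 and H 2. The sketch below is only a finite-dimensional illustration; the matrix, the step size and the residual names are choices made for this example.

```python
import numpy as np

A = np.array([[2.0, 1.0], [1.0, 2.0]])   # symmetric, eigenvalues 1 and 3
lam, U = np.linalg.eigh(A)

def K(s):
    """K(s) = exp(-s A), computed as phi_s(A) with phi_s(t) = exp(-s t)."""
    return U @ np.diag(np.exp(-s * lam)) @ U.T

s, h = 0.7, 1e-5
dK = (K(s + h) - K(s - h)) / (2 * h)                 # central difference for dK/ds
h1_residual = np.max(np.abs(dK + A @ K(s)))          # H 1: dK/ds = -A K(s)
h2_residual = np.max(np.abs(K(1e-8) - np.eye(2)))    # H 2: K(s) -> I as s -> 0
semigroup_residual = np.max(np.abs(K(0.3) @ K(0.4) - K(0.7)))
```

The last residual checks the semigroup law $K(s)K(s') = K(s+s')$, which follows from SPEC 1 applied to $\varphi_s \varphi_{s'} = \varphi_{s+s'}$.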


XX, §2. UNIQUENESS OF THE SPECTRAL MEASURE: 
THE TITCHMARSH-KODAIRA FORMULA 


The uniqueness proof of this section provides a substantial example of
the use of Dirac families, with weaker conditions than have been mentioned
previously. For our purposes here, we define a Dirac family to
be a family $\{\varphi_\varepsilon\}$ ($\varepsilon > 0$) of $L^1$-functions on R satisfying the following
properties:


DIR 1. We have $\varphi_\varepsilon \geq 0$ for all $\varepsilon$.


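DIR 2. For all $\varepsilon$, we have

$$\int_{\mathbf{R}} \varphi_\varepsilon(x) \, dx = 1.$$


DIR 3. Given $\delta > 0$ and $\delta' > 0$, we have

$$\int_{-\infty}^{-\delta} \varphi_\varepsilon + \int_{\delta}^{\infty} \varphi_\varepsilon < \delta'$$

for all $\varepsilon$ sufficiently close to 0.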


Theorem 2.1. Let $\{\varphi_\varepsilon\}$ be a Dirac family satisfying DIR 1, DIR 2,
DIR 3. Let h be bounded measurable on R. Then $\varphi_\varepsilon * h$ converges
uniformly to h as $\varepsilon \to 0$, on every compact set where h is continuous.


Proof. Same as in Chapter VIII, Theorem 3.1. 


Suppose given an association $f \mapsto f(A)$ satisfying the five spectral properties.
For each $v, w \in H$ there is a unique measure $\mu_{v,w}$ such that

$$\langle f(A)v, w\rangle = \int f \, d\mu_{v,w}.$$

Let z be complex and not real. The function f(t) such that

$$f(t) = \frac{1}{t - z}$$

is bounded measurable, and tf(t) is bounded. Also $(t - z)f(t) = 1$. Hence

$$(A - zI) f(A) = I.$$

This means that the resolvent has the integral expression

$$\langle (A - zI)^{-1}v, w\rangle = \int_{\mathbf{R}} \frac{1}{t - z} \, d\mu_{v,w}(t).$$

We write $\mu_v$ instead of $\mu_{v,v}$. Note that $\mu_v$ is a positive measure.


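Theorem 2.2. Let A be a self-adjoint operator on H and let $v \in H$. Let
$R(z) = (A - zI)^{-1}$ for z not real. For any $\psi \in C_c(\mathbf{R})$ we have

$$\int_{\mathbf{R}} \psi(\lambda) \, d\mu_v(\lambda) = \lim_{\varepsilon \to 0} \frac{1}{2\pi i} \int_{\mathbf{R}} \langle [R(\lambda + i\varepsilon) - R(\lambda - i\varepsilon)]v, v\rangle \, \psi(\lambda) \, d\lambda.$$

If $\lambda_1 < \lambda_2$ are real numbers which have $\mu_v$-measure 0, then

$$\int_{\lambda_1}^{\lambda_2} d\mu_v(\lambda) = \lim_{\varepsilon \to 0} \frac{1}{2\pi i} \int_{\lambda_1}^{\lambda_2} \langle [R(\lambda + i\varepsilon) - R(\lambda - i\varepsilon)]v, v\rangle \, d\lambda.$$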


The proof is based on the following lemma. 


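Lemma 2.3. Let $\mu$ be a positive regular measure on R such that $\mu(\mathbf{R})$ is
finite. Then for $\psi \in C_c(\mathbf{R})$ we have

$$\lim_{\varepsilon \to 0} \frac{1}{\pi} \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} \frac{\varepsilon}{(t - \lambda)^2 + \varepsilon^2} \, \psi(\lambda) \, d\mu(t) \, d\lambda = \int_{-\infty}^{\infty} \psi(\lambda) \, d\mu(\lambda).$$

Furthermore, if $\lambda_1 < \lambda_2$ are real and such that the set $\{\lambda_1, \lambda_2\}$ has
$\mu$-measure 0, then

$$\lim_{\varepsilon \to 0} \frac{1}{\pi} \int_{\lambda_1}^{\lambda_2} \int_{-\infty}^{\infty} \frac{\varepsilon}{(t - \lambda)^2 + \varepsilon^2} \, d\mu(t) \, d\lambda = \int_{\lambda_1}^{\lambda_2} d\mu(\lambda).$$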

Proof. First observe that the family of functions

$$\varphi_\varepsilon(t) = \frac{1}{\pi} \, \frac{\varepsilon}{t^2 + \varepsilon^2}$$

is a Dirac family on R for $\varepsilon \to 0$. The left-hand side integrals in our
lemma can be written

$$\int_{-\infty}^{\infty} \int_{-\infty}^{\infty} \varphi_\varepsilon(t - \lambda) h(\lambda) \, d\mu(t) \, d\lambda$$

where h is either $\psi$ or the characteristic function of the interval $[\lambda_1, \lambda_2]$.
We apply Fubini's theorem to see that this expression is equal to

$$\int_{-\infty}^{\infty} \varphi_\varepsilon * h(t) \, d\mu(t).$$

Note that $\varphi_\varepsilon * h$ is bounded, and converges pointwise to h if $h = \psi$, and
pointwise to h except at the end points $\lambda_1$, $\lambda_2$ in the other case. Since
we picked our interval so that the end points have $\mu$-measure 0, we can
apply the dominated convergence theorem to conclude the proof.
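The exceptional behavior at the end points can be seen in closed form. For h the characteristic function of $[a, b]$ and the kernel $\varphi_\varepsilon$ of the proof, the substitution $u = (y - x)/\varepsilon$ gives $(\varphi_\varepsilon * h)(x) = \frac{1}{\pi}\left[\arctan\frac{b - x}{\varepsilon} - \arctan\frac{a - x}{\varepsilon}\right]$. The short sketch below evaluates this formula; the function name and the sample points are choices made for illustration.

```python
import math

def poisson_conv_indicator(x, a, b, eps):
    """(phi_eps * h)(x) for h the characteristic function of [a, b] and
    phi_eps(t) = (1/pi) * eps / (t^2 + eps^2), evaluated in closed form."""
    return (math.atan((b - x) / eps) - math.atan((a - x) / eps)) / math.pi

a, b = 0.0, 1.0
for eps in (1e-1, 1e-3, 1e-6):
    inside = poisson_conv_indicator(0.5, a, b, eps)    # tends to 1 = h(0.5)
    endpoint = poisson_conv_indicator(1.0, a, b, eps)  # tends to 1/2, not h(1) = 1
    outside = poisson_conv_indicator(2.0, a, b, eps)   # tends to 0 = h(2)
    print(eps, inside, endpoint, outside)
```

At an end point the limit is 1/2 rather than the value of h, which is why the lemma requires the end points to have measure 0.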


The lemma obviously proves Theorem 2.2, because

$$\frac{1}{t - \lambda - i\varepsilon} - \frac{1}{t - \lambda + i\varepsilon} = \frac{2i\varepsilon}{(t - \lambda)^2 + \varepsilon^2}.$$

Furthermore, Theorem 2.2 provides the desired uniqueness left hanging
in the last section, because it gives the value of the measure entirely in
terms of the resolvent and Lebesgue measure, as on the right-hand side
of the first formula on elements of $C_c(\mathbf{R})$.
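The second formula of Theorem 2.2 can be tried out numerically on a symmetric matrix, where the spectral measure is a sum of point masses. The following is a rough sketch under that assumption: the helper `spectral_mass`, the value of $\varepsilon$ and the grid size are hypothetical choices, and the integral over $\lambda$ is approximated by the trapezoid rule with a step much smaller than $\varepsilon$.

```python
import numpy as np

def spectral_mass(A, v, a, b, eps=1e-2, n=4000):
    """Approximate mu_v([a, b]) by (1/(2 pi i)) times the integral over [a, b]
    of <[R(l + i eps) - R(l - i eps)] v, v>, where R(z) = (A - z I)^(-1)."""
    identity = np.eye(A.shape[0])
    grid = np.linspace(a, b, n + 1)
    vals = np.empty(n + 1)
    for k, l in enumerate(grid):
        r_plus = np.linalg.solve(A - (l + 1j * eps) * identity, v)   # R(l + i eps) v
        r_minus = np.linalg.solve(A - (l - 1j * eps) * identity, v)  # R(l - i eps) v
        vals[k] = (np.vdot(v, r_plus - r_minus) / (2j * np.pi)).real
    step = (b - a) / n
    return step * (vals.sum() - 0.5 * (vals[0] + vals[-1]))          # trapezoid rule

A = np.array([[2.0, 1.0], [1.0, 2.0]])   # eigenvalues 1 and 3
v = np.array([1.0, 0.0])                 # weight 1/2 on each eigenvalue

mass_low = spectral_mass(A, v, 0.0, 2.0)   # close to 0.5 (only eigenvalue 1)
mass_all = spectral_mass(A, v, 0.0, 4.0)   # close to 1.0 (both eigenvalues)
```

For fixed $\varepsilon > 0$ the computed values differ from the limits by terms of order $\varepsilon$, so only approximate agreement should be expected.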

It is possible to develop the spectral theory by starting with a dir- 
ect proof of Theorem 2.2, showing that the limit on the right-hand 
side exists. One then defines the spectral measure as that associated 
with the corresponding functional, and one proves the other properties 
from there. Cf. Akhiezer—Glazman, Theory of Linear Operators in Hil- 
bert Space, Translated from the Russian, New York, Frederick Ungar, 
1963, pp. 8 and 31. 


XX, §3. UNBOUNDED FUNCTIONS OF OPERATORS 


In the first two sections, we studied bounded functions of an operator, 
and this operator could be bounded or unbounded. But the values f(A) 
were bounded. We shall now extend this definition to arbitrary Borel 
measurable functions f, and in this way recover A itself as an integral. If 
A is unbounded, then we shall see that A = f(A) where f(t) =t; and of 
course, t is not a bounded function of t. 


Theorem 3.1. Let A be a self-adjoint operator. Let f be a real valued
Borel measurable function on R. Then there exists a unique self-adjoint
operator f(A) such that:


(i) The domain of f(A) consists of those $v \in H$ for which

$$f \in L^2(\mu_v).$$

(ii) For all v in the domain of f(A), we have

$$\langle f(A)v, v\rangle = \int f \, d\mu_v.$$

(iii) $|f(A)v|^2 = \int f^2 \, d\mu_v$.

(iv) If f is bounded, then f(A) has the previous meaning, and if $f(t) = t$,
then $f(A) = A$.


Proof. Observe first that the integral in (ii) exists by the Schwarz
inequality. To prove the theorem, let

$$f_n(t) = \begin{cases} f(t) & \text{if } n < f(t) \leq n + 1, \\ 0 & \text{otherwise.} \end{cases}$$

Let $Y_n = f^{-1}((n, n+1])$. Thus $f_n$ is equal to f on $Y_n$ and 0 outside $Y_n$.
Let $\chi_n$ be the characteristic function of $Y_n$ and let $E_n = \chi_n(A)$. Then $E_n$ is
a projection operator. Let

$$H_n = E_n H.$$

Then the $H_n$ are clearly mutually orthogonal, and we contend that we
have a direct sum decomposition

$$H = \bigoplus_n H_n$$

as in Theorem 2.6 of Chapter XIX.
To see this, note that

$$\sum_{n=-N}^{N} \chi_n \to 1$$

as $N \to \infty$, and hence

$$\sum_{n=-N}^{N} E_n \to I$$

strongly.


Let $B_n = f_n(A)$, so that $B_n$ is bounded, and operates on $H_n$ through the
projection on $H_n$, because $f_n \chi_n = \chi_n f_n = f_n$, whence

$$f_n(A) \chi_n(A) = \chi_n(A) f_n(A) = f_n(A).$$

Let $f(A) = B$ be the self-adjoint operator whose domain is the usual one,
consisting of $v = \sum v_n$ with $v_n \in H_n$ and $\sum |B_n v_n|^2 < \infty$. Then

$$\sum_n \langle f_n(A)v_n, f_n(A)v_n\rangle = \sum_n \int f_n^2 \, d\mu_v = \int f^2 \, d\mu_v$$

by the monotone convergence theorem. It follows that $f \in L^2(\mu_v)$. The
converse is similarly clear. This proves (i) and (iii). Also we get

$$\langle f(A)v, v\rangle = \sum_n \langle f_n(A)v_n, v_n\rangle = \sum_n \int f_n \, d\mu_v = \int f \, d\mu_v.$$

This proves everything except the final assertion that if $f(t) = t$, then
$f(A) = A$. But this follows from the fact that $f_n(A)$ is equal to A restricted
to $H_n = E_n H$, since $f_n(A)$ and $\chi_n(A)$ have the usual meaning, as in
§1. This concludes the proof of the theorem.
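Condition (i) can be made concrete with a model example that is not taken from the text: the diagonal operator $A e_n = n e_n$ on $\ell^2$, for which $\mu_v = \sum_n |v_n|^2 \delta_n$ and, with $f(t) = t$, the integral of $f^2$ against $\mu_v$ is $\sum_n n^2 |v_n|^2$. The sketch below compares partial sums for two hypothetical vectors, one outside the domain of A and one inside.

```python
import numpy as np

def truncated_norms(coeffs):
    """Partial sums of |v|^2 = sum |v_n|^2 and of the integral of t^2 against
    mu_v = sum |v_n|^2 delta_n, for the diagonal operator A e_n = n e_n."""
    n = np.arange(1, len(coeffs) + 1, dtype=float)
    return np.cumsum(coeffs**2), np.cumsum(n**2 * coeffs**2)

N = 100_000
n = np.arange(1, N + 1, dtype=float)
v = 1.0 / n          # in l^2, but sum n^2 |v_n|^2 = N grows without bound
w = 1.0 / n**2       # sum n^2 |w_n|^2 = sum 1/n^2 converges to pi^2/6

v_norm, v_energy = truncated_norms(v)
w_norm, w_energy = truncated_norms(w)
```

Here `v_energy` grows linearly with the truncation level, so v is not in the domain of A, while `w_energy` stays bounded, so w is.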


XX, §4. SPECTRAL FAMILIES OF PROJECTIONS 


By a spectral family in a Hilbert space H we mean a family of orthogonal
projections $\{P_t\}$, $t \in \mathbf{R}$, satisfying the following conditions:


SF 1. If $a < b$, then $P_a \leq P_b$.

SF 2. $\lim_{t \to -\infty} P_t = O$ and $\lim_{t \to \infty} P_t = I$ strongly.


The first condition means that if $H_a$ and $H_b$ are the subspaces on which
$P_a$ and $P_b$ project H, then $H_a \subset H_b$ and $P_a$ projects $H_b$ on $H_a$. The second
means that for each vector $v \in H$ we have

$$\lim_{t \to -\infty} P_t v = 0 \quad \text{and} \quad \lim_{t \to \infty} P_t v = v.$$


In Chapter XIX we defined such a family for a bounded hermitian
operator. In this case, we note that $P_a = O$ for a large negative, and
$P_b = I$ for b large positive. A spectral family satisfying this additional
condition is called limited.

Observe also that the spectral family associated with a bounded operator
is continuous on the right by Theorem 1.5 of Chapter XIX. However,
we do not assume right-continuity in our general definition of a
spectral family. The spectral family associated with a bounded operator
A was defined as follows. For each $b \in \mathbf{R}$ we let $\psi_b$ be the function as
shown in Figure 20.1.


Figure 20.1


Then $P_b = \psi_b(A)$ defines the spectral family associated with A. But we
have seen in the first section of this chapter how to make sense of f(A)
when f is bounded measurable and A is a self-adjoint operator, not
necessarily bounded. This allows us to get:

Theorem 4.1. Let A be a self-adjoint operator on H. Let $P_t = \psi_t(A)$.
Then $\{P_t\}$ is a spectral family, strongly continuous on the right.


Proof. As before, we shall obtain an expression for $P_t$ in terms of a
direct sum decomposition as in Theorem 2.6 and 2.7 of Chapter XIX.
Suppose that

$$H = \bigoplus_n H_n \quad \text{and} \quad A = \bigoplus_n A_n,$$

where each $A_n = A|H_n$ is bounded hermitian. Let $\{P_t^{(n)}\}$ be the spectral
family of $A_n$ on $H_n$, so $P_t^{(n)} = \psi_t(A_n)$. Then by §1, we get

$$P_t = \sum_n P_t^{(n)}.$$

The first condition SF 1 is obviously satisfied. For the second, fix
$v = \sum v_n$. Then

$$P_t v = \sum_n P_t^{(n)} v_n.$$

Select N so large that

$$\sum_{n > N} |v_n|^2 < \varepsilon.$$

Then let $t \to -\infty$ to get the first limit. For $t \to \infty$ consider $v - P_t v$.
Finally, we want to prove continuity from the right, i.e. for $v \in H$ we
want to show

$$\lim_{\delta \to 0} (P_{t+\delta} - P_t)v = 0.$$

We look at

$$\sum_n |(P_{t+\delta}^{(n)} - P_t^{(n)})v_n|^2.$$

Again take N so large that $\sum_{n > N} |v_n|^2 < \varepsilon$. We can then find $\delta$ so small that

$$\sum_{n=1}^{N} |(P_{t+\delta}^{(n)} - P_t^{(n)})v_n|^2 < \varepsilon,$$

thus getting our continuity and proving the theorem.


XX, §5. THE SPECTRAL INTEGRAL AS 
STIELTJES INTEGRAL 


In Chapter IX, §7 we defined the Stieltjes integral with respect to an 
increasing real valued function. Such a function arises naturally from a 
spectral family, as follows. If h is an increasing function, we again let dh 
be the associated Stieltjes functional on $C_c(\mathbf{R})$.

Let $\{P_t\}$ be a spectral family, not necessarily associated with an operator.
Let $v \in H$ and let $h = h_{P,v}$ be the function

$$h(t) = \langle P_t v, v\rangle.$$

Then h is positive, increasing, and bounded by $|v|^2$ (hence by 1 if v is a unit vector).


Theorem 5.1. Let A be a self-adjoint operator, and let $\{P_t\}$ be the
associated spectral family. Let $h(t) = \langle P_t v, v\rangle$ as above. Then for any
function $\varphi$ in $C_c(\mathbf{R})$, we have

$$\int \varphi \, dh = \int \varphi \, d\mu_v$$

where $\mu_v$ is the spectral measure associated with A and v.
Proof. We know from §4 that

$$P_t = \psi_t(A).$$

For a partition T of sufficiently small size, the integral

$$\int \varphi \, dh$$

is approximated by a sum

$$\sum_k \varphi(c_k) \langle (P_{t_{k+1}} - P_{t_k})v, v\rangle = \sum_k \varphi(c_k) \langle (\psi_{t_{k+1}}(A) - \psi_{t_k}(A))v, v\rangle = \int \sum_k \varphi(c_k)(\psi_{t_{k+1}} - \psi_{t_k}) \, d\mu_v.$$

But

$$\sum_k \varphi(c_k)(\psi_{t_{k+1}} - \psi_{t_k})$$

is an ordinary Riemann sum for $\varphi$, uniformly close to $\varphi$ if the partition
has sufficiently small size. By the dominated convergence theorem with
respect to the measure $\mu_v$ this last expression is therefore uniformly close
to

$$\int \varphi \, d\mu_v,$$

thus proving the theorem.
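For a symmetric matrix the spectral family and the Stieltjes sums of the proof can be computed directly. The sketch below is a finite-dimensional illustration only; it assumes, as in the exercises of §6, that $\psi_t$ is the characteristic function of $(-\infty, t]$, and the partition, the choice $c_k = t_{k+1}$ and the function $\varphi$ are arbitrary.

```python
import numpy as np

A = np.array([[2.0, 1.0], [1.0, 2.0]])   # symmetric, eigenvalues 1 and 3
lam, U = np.linalg.eigh(A)
v = np.array([1.0, 2.0])

def P(t):
    """P_t = psi_t(A): orthogonal projection on eigenvectors with eigenvalue <= t."""
    mask = (lam <= t).astype(float)
    return U @ np.diag(mask) @ U.T

def h(t):
    """h(t) = <P_t v, v>, an increasing step function of t."""
    return v @ P(t) @ v

def stieltjes_sum(phi, a, b, n):
    """sum_k phi(c_k) * (h(t_{k+1}) - h(t_k)) with c_k = t_{k+1}."""
    t = np.linspace(a, b, n + 1)
    hv = np.array([h(x) for x in t])
    return np.sum(phi(t[1:]) * np.diff(hv))

phi = lambda t: np.exp(-t**2)
approx = stieltjes_sum(phi, 0.0, 4.0, 4000)
exact = v @ (U @ np.diag(phi(lam)) @ U.T) @ v    # <phi(A) v, v>
```

Here h jumps by 0.5 at the eigenvalue 1 and by 4.5 at the eigenvalue 3, so that its total variation is $|v|^2 = 5$.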


XX, §6. EXERCISES 


Instead of starting with a self-adjoint operator as in the text, one may start with
a spectral family, develop the functional calculus, and get back (unbounded)
operators as follows.

1. Let $\{P_t\}$ be a spectral family, and let $v \in H$. Let

$$h(t) = \langle P_t v, v\rangle.$$

Show that there is a positive functional $\lambda_v$, bounded by 1, such that

$$\lambda_v(\varphi) = \lim \sum_k \varphi(c_k) \langle (P_{t_{k+1}} - P_{t_k})v, v\rangle = \lim_{T,c} S(T, c, \varphi),$$

the limit being taken in the same sense as in the text, for the size of the
partition tending to 0. Deduce the existence and uniqueness of a measure $\mu_v$
such that

$$\lambda_v(\varphi) = \int \varphi \, d\mu_v.$$

In a similar way, obtain the complex measure $\mu_{v,w}$ such that

$$\lambda_{v,w}(\varphi) = \lim \sum_k \varphi(c_k) \langle (P_{t_{k+1}} - P_{t_k})v, w\rangle.$$

Show that

$$\left| \int_{\mathbf{R}} f \, d\mu_{v,w} \right| \leq \|f\|_\infty |v| |w|.$$

2. Conclude that there exists a unique bounded operator f(P) for each $f \in BM(\mathbf{R})$
such that

$$\int_{\mathbf{R}} f \, d\mu_{v,w} = \langle f(P)v, w\rangle.$$

3. Show that the map $f \mapsto f(P)$ is a linear map from $BM(\mathbf{R})$ into the space of
operators, satisfying the five properties SPEC 1 through SPEC 5, except that
A is replaced by P.

4. Theorem. Let $\{P_t\}$ be a spectral family. Let f be a real Borel measurable
function on R. Then there exists a unique self-adjoint operator f(P) such that:

(i) The domain of f(P) consists of those $v \in H$ such that $f \in L^2(\mu_v)$.
(ii) $\langle f(P)v, v\rangle = \int f \, d\mu_v$ for all $v \in$ Domain of f(P).
(iii) $|f(P)v|^2 = \int f^2 \, d\mu_v$.

[Hint: Follow step by step the proof given for the analogous theorem in the
text, concerning f(A) when we start with a self-adjoint operator A.]


Right continuity played no role in the above results. It is important only 
for uniqueness purposes, as shown in the next result. 


5. Theorem. Let $\{P_t\}$, $\{Q_t\}$ be spectral families, which are strongly continuous from
the right, and such that they induce the same functional on $C_c(\mathbf{R})$. Then $P_t = Q_t$
for all t. If b is a real number and $\psi_b$ is the function whose graph is drawn
below, then

$$\psi_b(P) = P_b.$$

[Figure: graph of $\psi_b$]

$$\psi_b(t) = \begin{cases} 1 & \text{if } t \leq b, \\ 0 & \text{otherwise.} \end{cases}$$

Proof. From the assumption it follows that if $\varphi \in C_c(\mathbf{R})$, then $\varphi(P) = \varphi(Q)$.
Let $\{g_n\}$ be a sequence decreasing to $\psi_b$ as shown.

[Figure: graph of $g_n(t)$]

Then for a fine partition,

$$\langle P_b v, v\rangle = \sum_{t_{k+1} \leq b} \langle (P_{t_{k+1}} - P_{t_k})v, v\rangle \leq \int g_n \, d\mu_v \leq \langle P_{b+1/n} v, v\rangle.$$

Since

$$\int g_n \, d\mu_v \to \int \psi_b \, d\mu_v = \langle \psi_b(P)v, v\rangle$$

as $n \to \infty$, we get $P_b = \psi_b(P)$, thereby proving the theorem.


PART SIX 


Global Analysis 


One of the most attractive things that can be done with analysis is to 
mix it up with the global topology of geometric structures. For instance, 
whereas the local existence theorem for differential equations yields inte- 
gral curves in an open set of say euclidean space, one may wish to see 
what happens if a differential equation is given on the sphere. In this 
case, the integral curves wind around the sphere and one investigates their 
behavior as time goes to infinity. Similarly, one can work on toruses, or 
arbitrarily complicated similar structures, which have one thing in com- 
mon: locally, they look like euclidean space, but globally they turn and 
twist. The relations between the analytic properties, and the algebraic- 
topological invariants associated with the topological structure, constitute 
one of the central parts of mathematics. Our task here is but to lay 
down the most basic definitions to prepare readers for further readings, 
and to give them the flavor of global results, as distinguished from local 
ones in open sets of euclidean space. 

We should add, however, that even on open sets of euclidean space,
i.e. locally, we may be interested in certain objects and properties which
are invariant under $C^p$ changes of coordinate systems, i.e. under $C^p$
isomorphisms. The language of manifolds provides the natural language for
such properties. Thus we begin with the change of variables formula,
which gives an example of how the integral changes under $C^1$ isomorphisms.
The change is of such a nature that we can associate with it an
integral on manifolds. This is done in the last chapter, which includes 
the basic theorem of Stokes. 

We don’t do too much with differential equations besides defining the
basic notions on manifolds. Readers can refer to [La 2] for further
foundations. Smale’s survey [Sm 2] is an excellent starting point for the
global analysis of ordinary differential equations. As for partial differential
equations, Nirenberg’s exposition of certain basic results in [Pr] gives
an exceptionally attractive introduction for this part of global analysis. 
In fact, the whole proceedings [Pr] are highly recommended. 


CHAPTER XXI


Local Integration of 
Differential Forms 


Throughout this chapter, $\mu$ is Lebesgue measure on $\mathbf{R}^n$.
If A is a subset of $\mathbf{R}^n$, we write $\mathcal{L}^1(A)$ instead of $\mathcal{L}^1(A, \mu, \mathbf{C})$.


XXI, §1. SETS OF MEASURE 0 


We recall that a set has measure 0 in $\mathbf{R}^n$ if and only if, given $\varepsilon$, there
exists a covering of the set by a sequence of rectangles $\{R_j\}$ such that
$\sum \mu(R_j) < \varepsilon$. We denote by $R_j$ the closed rectangles, and we may always
assume that the interiors $R_j^0 = \mathrm{Int}(R_j)$ cover the set, at the cost of
increasing the lengths of the sides of our rectangles very slightly (an $\varepsilon/2^n$
argument). We shall prove here some criteria for a set to have measure
0. We leave it to the reader to verify that instead of rectangles, we could
have used cubes in our characterization of a set of measure 0 (a cube
being a rectangle all of whose sides have the same length).

We recall that a map f satisfies a Lipschitz condition on a set A if
there exists a number C such that

$$|f(x) - f(y)| \leq C|x - y|$$

for all $x, y \in A$. Any $C^1$ map f satisfies locally at each point a Lipschitz
condition, because its derivative is bounded in a neighborhood of each
point, and we can then use the mean value estimate,

$$|f(x) - f(y)| \leq |x - y| \sup |f'(z)|,$$

the sup being taken for z on the segment between x and y. We can take
the neighborhood of the point to be a ball, say, so that the segment
between any two points is contained in the neighborhood.


Lemma 1.1. Let A have measure 0 in $\mathbf{R}^n$ and let $f: A \to \mathbf{R}^n$ satisfy a
Lipschitz condition. Then f(A) has measure 0.


Proof. Let C be a Lipschitz constant for f. Let $\{R_j\}$ be a sequence of
cubes covering A such that $\sum \mu(R_j) < \varepsilon$. Let $r_j$ be the length of the side
of $R_j$. Then for each j we see that $f(A \cap R_j)$ is contained in a cube $R_j'$
whose sides have length $\leq 2Cr_j$. Hence

$$\mu(R_j') \leq 2^n C^n r_j^n = 2^n C^n \mu(R_j).$$

Our lemma follows.


Lemma 1.2. Let U be open in $\mathbf{R}^n$ and let $f: U \to \mathbf{R}^n$ be a $C^1$ map. Let
Z be a set of measure 0 in U. Then f(Z) has measure 0.


Proof. For each $x \in U$ there exists a rectangle $R_x$ contained in U such
that the family $\{R_x^0\}$ of interiors covers Z. Since U is separable, there
exists a denumerable subfamily covering Z, say $\{R_j\}$. It suffices to prove
that $f(Z \cap R_j)$ has measure 0 for each j. But f satisfies a Lipschitz
condition on $R_j$ since $R_j$ is compact and $f'$ is bounded on $R_j$, being
continuous. Our lemma follows from Lemma 1.1.


Lemma 1.3. Let A be a subset of $\mathbf{R}^m$. Assume that $m < n$. Let
$f: A \to \mathbf{R}^n$ satisfy a Lipschitz condition. Then f(A) has measure 0.


Proof. We view $\mathbf{R}^m$ as embedded in $\mathbf{R}^n$ on the space of the first m coordinates.
Then $\mathbf{R}^m$ has measure 0 in $\mathbf{R}^n$, so that A has also n-dimensional
measure 0. Lemma 1.3 is therefore a consequence of Lemma 1.1.


Note. All three lemmas may be viewed as stating that certain parame- 
trized sets have measure 0. Lemma 1.3 shows that parametrizing a set by 
strictly lower dimensional spaces always yields an image having measure 
0. The other two lemmas deal with a map from one space into another 
of the same dimension. Observe that Lemma 1.3 would be false if f is 
only assumed to be continuous (Peano curves). 


XXI, §2. CHANGE OF VARIABLES FORMULA 


We first deal with the simplest of cases. We consider vectors $v_1, \ldots, v_n$ in
$\mathbf{R}^n$ and we define the block B spanned by these vectors to be the set of
points

$$t_1 v_1 + \cdots + t_n v_n$$

with $0 \leq t_i \leq 1$. We say that the block is degenerate (in $\mathbf{R}^n$) if the vectors
$v_1, \ldots, v_n$ are linearly dependent. Otherwise, we say that the block is
non-degenerate, or is a proper block in $\mathbf{R}^n$.


We see that a block in $\mathbf{R}^2$ is nothing but a parallelogram, and a block in
$\mathbf{R}^3$ is nothing but a parallelepiped (when not degenerate).
We shall sometimes use the word volume instead of measure when
applied to blocks or their images under maps, for the sake of geometry.
We denote by $\mathrm{Vol}(v_1, \ldots, v_n)$ the volume of the block B spanned by
$v_1, \ldots, v_n$. We define the oriented volume

$$\mathrm{Vol}^0(v_1, \ldots, v_n) = \pm \mathrm{Vol}(v_1, \ldots, v_n),$$

taking the + if $\mathrm{Det}(v_1, \ldots, v_n) > 0$ and the − if $\mathrm{Det}(v_1, \ldots, v_n) < 0$. The
determinant is viewed as the determinant of the matrix whose column
vectors are $v_1, \ldots, v_n$, in that order.

We recall the following characterization of determinants. Suppose that
we have a product

$$(v_1, \ldots, v_n) \mapsto v_1 \wedge v_2 \wedge \cdots \wedge v_n$$

which to each n-tuple of vectors associates a number such that the product
is multilinear, alternating, and such that

$$e_1 \wedge \cdots \wedge e_n = 1$$

if $e_1, \ldots, e_n$ are the unit vectors. Then this product is necessarily the
determinant, i.e. it is uniquely determined. “Alternating” means that if
$v_i = v_j$ for some $i \neq j$ then

$$v_1 \wedge \cdots \wedge v_n = 0.$$

The uniqueness is easily proved, and we recall this short proof. We can
write

$$v_i = a_{i1} e_1 + \cdots + a_{in} e_n$$

for suitable numbers $a_{ij}$, and then

$$v_1 \wedge \cdots \wedge v_n = (a_{11} e_1 + \cdots + a_{1n} e_n) \wedge \cdots \wedge (a_{n1} e_1 + \cdots + a_{nn} e_n)$$

$$= \sum_\sigma a_{1,\sigma(1)} e_{\sigma(1)} \wedge \cdots \wedge a_{n,\sigma(n)} e_{\sigma(n)}$$

$$= \sum_\sigma a_{1,\sigma(1)} \cdots a_{n,\sigma(n)} \, e_{\sigma(1)} \wedge \cdots \wedge e_{\sigma(n)}.$$

The sum is taken over all maps $\sigma: \{1, \ldots, n\} \to \{1, \ldots, n\}$, but because of
the alternating property, whenever $\sigma$ is not a permutation the term corresponding
to $\sigma$ is equal to 0. Hence the sum may be taken only over all
permutations. Since

$$e_{\sigma(1)} \wedge \cdots \wedge e_{\sigma(n)} = \varepsilon(\sigma) \, e_1 \wedge \cdots \wedge e_n$$

where $\varepsilon(\sigma) = 1$ or $-1$ is a sign depending only on $\sigma$, it follows that the
alternating product is completely determined by its value $e_1 \wedge \cdots \wedge e_n$,
and in particular is the determinant if this value is equal to 1.
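The expansion over permutations obtained in this proof can be turned directly into a (very inefficient) way of computing determinants. The sketch below is meant only to make the formula concrete; the function names are hypothetical and the sign is computed by counting inversions, which is one standard way to obtain $\varepsilon(\sigma)$.

```python
from itertools import permutations

def sign(perm):
    """Sign of a permutation, computed by counting inversions."""
    inversions = sum(
        1
        for i in range(len(perm))
        for j in range(i + 1, len(perm))
        if perm[i] > perm[j]
    )
    return -1 if inversions % 2 else 1

def det_by_expansion(a):
    """Sum over permutations s of sign(s) * a[0][s(0)] * ... * a[n-1][s(n-1)]."""
    n = len(a)
    total = 0.0
    for perm in permutations(range(n)):
        term = float(sign(perm))
        for i in range(n):
            term *= a[i][perm[i]]
        total += term
    return total

a = [[2.0, 1.0, 0.0], [1.0, 3.0, 1.0], [0.0, 1.0, 4.0]]
repeated = [[1.0, 2.0, 3.0], [1.0, 2.0, 3.0], [0.0, 1.0, 4.0]]  # two equal rows
```

For the matrix `a` the expansion gives 18, and for `repeated` it gives 0, in line with the alternating property.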


Theorem 2.1. We have

$$\mathrm{Vol}^0(v_1, \ldots, v_n) = \mathrm{Det}(v_1, \ldots, v_n)$$

and

$$\mathrm{Vol}(v_1, \ldots, v_n) = |\mathrm{Det}(v_1, \ldots, v_n)|.$$
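In the plane the statement can be checked against an independent area formula. The sketch below compares $|\mathrm{Det}(v_1, v_2)|$ with the area of the parallelogram computed from its vertices by the shoelace formula, and also checks that replacing $v_1$ by $v_1 + c v_2$ leaves the area unchanged, a fact used in the proof. The vectors and the constant c are arbitrary choices made for this example.

```python
def det2(v1, v2):
    """Determinant of the 2x2 matrix with columns v1, v2."""
    return v1[0] * v2[1] - v1[1] * v2[0]

def shoelace_area(points):
    """Unsigned area of a simple polygon given by its vertices in order."""
    s = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:] + points[:1]):
        s += x0 * y1 - x1 * y0
    return abs(s) / 2.0

def block_area(v1, v2):
    """Area of the block {t1*v1 + t2*v2 : 0 <= t1, t2 <= 1}."""
    corner = (v1[0] + v2[0], v1[1] + v2[1])
    return shoelace_area([(0.0, 0.0), v1, corner, v2])

v1, v2 = (3.0, 1.0), (1.0, 2.0)
c = 2.5
sheared = (v1[0] + c * v2[0], v1[1] + c * v2[1])   # v1 + c*v2
```

With these values both `block_area(v1, v2)` and `block_area(sheared, v2)` equal 5, which is also $|\mathrm{Det}(v_1, v_2)|$.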


Proof. If $v_1, \ldots, v_n$ are linearly dependent, then the determinant is
equal to 0, and the volume is also equal to 0, for instance by Lemma 1.3.
So our formula holds in this case. It is clear that

$$\mathrm{Vol}^0(e_1, \ldots, e_n) = 1.$$

To show that $\mathrm{Vol}^0$ satisfies the characteristic properties of the determinant,
all we have to do now is to show that it is linear in each variable,
say the first. In other words, we must prove

$$(*) \quad \mathrm{Vol}^0(cv, v_2, \ldots, v_n) = c \, \mathrm{Vol}^0(v, v_2, \ldots, v_n) \quad \text{for } c \in \mathbf{R},$$

$$(**) \quad \mathrm{Vol}^0(v + w, v_2, \ldots, v_n) = \mathrm{Vol}^0(v, v_2, \ldots, v_n) + \mathrm{Vol}^0(w, v_2, \ldots, v_n).$$

As to the first assertion, suppose first that c is some positive integer k.
Let B be the block spanned by $v, v_2, \ldots, v_n$. We may assume without
loss of generality that $v, v_2, \ldots, v_n$ are linearly independent (otherwise, the
relation is obviously true, both sides being equal to 0). We verify at once
from the definition that if $B(v, v_2, \ldots, v_n)$ denotes the block spanned by
$v, v_2, \ldots, v_n$ then $B(kv, v_2, \ldots, v_n)$ is the union of the two sets

$$B((k - 1)v, v_2, \ldots, v_n) \quad \text{and} \quad B(v, v_2, \ldots, v_n) + (k - 1)v,$$

which have only a set of measure 0 in common, as one verifies at once
from the definitions.


Therefore, we find that

$$\mathrm{Vol}(kv, v_2, \ldots, v_n) = \mathrm{Vol}((k - 1)v, v_2, \ldots, v_n) + \mathrm{Vol}(v, v_2, \ldots, v_n)$$
$$= (k - 1)\mathrm{Vol}(v, v_2, \ldots, v_n) + \mathrm{Vol}(v, v_2, \ldots, v_n)$$
$$= k \, \mathrm{Vol}(v, v_2, \ldots, v_n),$$

as was to be shown.
Now let

$$v = v_1/k$$

for a positive integer k. Then applying what we have just proved shows
that

$$\mathrm{Vol}\left(\frac{1}{k} v_1, v_2, \ldots, v_n\right) = \frac{1}{k} \mathrm{Vol}(v_1, v_2, \ldots, v_n).$$

Writing a positive rational number in the form $m/k = m \cdot 1/k$, we conclude
that the first relation holds when c is a positive rational number. If r is
a positive real number, we find positive rational numbers c, c' such that
$c \leq r \leq c'$. Since

$$B(cv, v_2, \ldots, v_n) \subset B(rv, v_2, \ldots, v_n) \subset B(c'v, v_2, \ldots, v_n),$$

we conclude that

$$c \, \mathrm{Vol}(v, v_2, \ldots, v_n) \leq \mathrm{Vol}(rv, v_2, \ldots, v_n) \leq c' \, \mathrm{Vol}(v, v_2, \ldots, v_n).$$

Letting c, c' approach r as a limit, we conclude that for any real number
$r \geq 0$ we have

$$\mathrm{Vol}(rv, v_2, \ldots, v_n) = r \, \mathrm{Vol}(v, v_2, \ldots, v_n).$$

Finally, we note that $B(-v, v_2, \ldots, v_n)$ is the translation of

$$B(v, v_2, \ldots, v_n)$$

by $-v$ so that these two blocks have the same volume. This proves the
first assertion.


As for the second, we look at the geometry of the situation, which is
made clear by the following picture in case $v = v_1$, $w = v_2$.


The block spanned by $v_1, v_2, \ldots$ consists of two “triangles” T, T' having
only a set of measure zero in common. The block spanned by $v_1 + v_2$
and $v_2$ consists of T' and the translation $T + v_2$. It follows that these
two blocks have the same volume. We conclude that for any number c,

$$\mathrm{Vol}^0(v_1 + cv_2, v_2, \ldots, v_n) = \mathrm{Vol}^0(v_1, v_2, \ldots, v_n).$$

Indeed, if $c = 0$ this is obvious, and if $c \neq 0$ then

$$c \, \mathrm{Vol}^0(v_1 + cv_2, v_2, \ldots, v_n) = \mathrm{Vol}^0(v_1 + cv_2, cv_2, \ldots, v_n)$$
$$= \mathrm{Vol}^0(v_1, cv_2, \ldots, v_n) = c \, \mathrm{Vol}^0(v_1, v_2, \ldots, v_n).$$

We can then cancel c to get our conclusion.

To prove the linearity of $\mathrm{Vol}^0$ with respect to its first variable, we may
assume that $v_2, \ldots, v_n$ are linearly independent, otherwise both sides of
(**) are equal to 0. Let $v_1$ be so chosen that $\{v_1, \ldots, v_n\}$ is a basis of $\mathbf{R}^n$.
Then by induction, and what has been proved above,

$$\mathrm{Vol}^0(c_1 v_1 + \cdots + c_n v_n, v_2, \ldots, v_n) = \mathrm{Vol}^0(c_1 v_1 + \cdots + c_{n-1} v_{n-1}, v_2, \ldots, v_n)$$
$$= \mathrm{Vol}^0(c_1 v_1, v_2, \ldots, v_n)$$
$$= c_1 \mathrm{Vol}^0(v_1, \ldots, v_n).$$

From this the linearity follows at once, and the theorem is proved.

Corollary 2.2. Let S be the unit cube spanned by the unit vectors in $\mathbf{R}^n$.
Let $\lambda: \mathbf{R}^n \to \mathbf{R}^n$ be a linear map. Then

$$\mathrm{Vol}\, \lambda(S) = |\mathrm{Det}(\lambda)|.$$


Proof. If $v_1, \ldots, v_n$ are the images of $e_1, \ldots, e_n$ under $\lambda$, then $\lambda(S)$ is the
block spanned by $v_1, \ldots, v_n$. If we represent $\lambda$ by the matrix $A = (a_{ij})$,
then

$$v_i = a_{1i} e_1 + \cdots + a_{ni} e_n$$

and hence $\mathrm{Det}(v_1, \ldots, v_n) = \mathrm{Det}(A) = \mathrm{Det}(\lambda)$. This proves the corollary.


Corollary 2.3. If R is any rectangle in $\mathbf{R}^n$ and $\lambda: \mathbf{R}^n \to \mathbf{R}^n$ is a linear
map, then

$$\mathrm{Vol}\, \lambda(R) = |\mathrm{Det}(\lambda)| \, \mathrm{Vol}(R).$$


Proof. After a translation, we can assume that the rectangle is a
block. If $R = \lambda_1(S)$ where S is the unit cube, then

$$\lambda(R) = \lambda \circ \lambda_1(S),$$

whence by Corollary 2.2,

$$\mathrm{Vol}\, \lambda(R) = |\mathrm{Det}(\lambda \circ \lambda_1)| = |\mathrm{Det}(\lambda) \, \mathrm{Det}(\lambda_1)| = |\mathrm{Det}(\lambda)| \, \mathrm{Vol}(R).$$

The next theorem extends Corollary 2.3 to the more general case
where the linear map $\lambda$ is replaced by an arbitrary $C^1$-invertible map.
The proof then consists of replacing the $C^1$ map by its derivative and
estimating the error thus introduced. For this purpose, we define the
Jacobian determinant

$$\Delta_f(x) = \mathrm{Det}\, J_f(x) = \mathrm{Det}\, f'(x)$$

where $J_f(x)$ is the Jacobian matrix, and $f'(x)$ is the derivative of the map
$f: U \to \mathbf{R}^n$.


Theorem 2.4. Let R be a rectangle in R", contained in some open set U. 
Let f: U >R" be a C' map, which is C'-invertible on U. Then 


u(f(R)) = {. JA,| dy. 


Proof. When f is linear, this is nothing but Corollary 2.3 of the 
preceding theorem. We shall prove the general case by approximating f 
by its derivative. Let us first assume that R is a cube for simplicity. 
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Given «, let P be a partition of R, obtained by dividing each side of R 
into N equal segments for large N. Then R is partitioned into N" 
subcubes which we denote by S, (j = 1,...,N"). We let a; be the center 
of S;. 
We have 
Vol f(R) = ¥; Vol f(S)) 
J 


because the images $f(S_j)$ have only sets of measure 0 in common. We
investigate $f(S_j)$ for each $j$. The derivative $f'$ is uniformly
continuous on $R$. Given $\epsilon$, we assume that $N$ has been taken
so large that for $x \in S_j$ we have

\[ f(x) = f(a_j) + \lambda_j(x - a_j) + \varphi(x - a_j) \]

where $\lambda_j = f'(a_j)$ and

\[ |\varphi(x - a_j)| \le |x - a_j|\,\epsilon. \]


To determine $\mathrm{Vol}\, f(S_j)$ we must therefore investigate
$f(S)$ where $S$ is a cube centered at the origin, and $f$ has the form

\[ f(x) = \lambda x + \varphi(x), \qquad |\varphi(x)| \le |x|\,\epsilon \]

on the cube $S$. (We have made suitable translations which don't affect
volumes.) We have

\[ \lambda^{-1} \circ f(x) = x + \lambda^{-1} \circ \varphi(x), \]

so that $\lambda^{-1} \circ f$ is nearly the identity map. For some
constant $C$, we have for $x \in S$:

\[ |\lambda^{-1} \circ \varphi(x)| \le C\epsilon. \]


From the lemma after the proof of the inverse mapping theorem, we
conclude that $\lambda^{-1} \circ f(S)$ contains a cube of radius

\[ (1 - C\epsilon)(\text{radius } S) \]

and trivial estimates show that $\lambda^{-1} \circ f(S)$ is contained
in a cube of radius

\[ (1 + C\epsilon)(\text{radius } S). \]

We apply $\lambda$ to these cubes, and determine their volumes. Putting
indices $j$ on everything, we find that

\[ |\mathrm{Det}\, f'(a_j)|\,\mathrm{Vol}(S_j) - \epsilon C_1 \mathrm{Vol}(S_j)
   \le \mathrm{Vol}\, f(S_j)
   \le |\mathrm{Det}\, f'(a_j)|\,\mathrm{Vol}(S_j) + \epsilon C_1 \mathrm{Vol}(S_j) \]

with some fixed constant $C_1$. Summing over $j$ and estimating
$|\Delta_f|$, we see that our theorem follows at once, in case $R$ is a
cube.


Remark. We assumed for simplicity that R was a cube. Actually, by 
changing the norm on each side, multiplying by a suitable constant, and 
taking the sup of the adjusted norms, we see that this involves no loss of 
generality. Alternatively, we can approximate a given rectangle by cubes. 


Corollary 2.5. If $g$ is a continuous function on $f(R)$, then

\[ \int_{f(R)} g \, d\mu = \int_R (g \circ f)\,|\Delta_f| \, d\mu. \]


Proof. The functions $g$ and $(g \circ f)|\Delta_f|$ are uniformly
continuous on $f(R)$ and $R$ respectively. Let us take a partition of
$R$ and let $\{S_j\}$ be the subrectangles of this partition. If
$\delta$ is the maximum length of the sides of the subrectangles of the
partition, then $f(S_j)$ is contained in a rectangle whose sides have
length $\le C\delta$ for some constant $C$. We have

\[ \int_{f(R)} g \, d\mu = \sum_j \int_{f(S_j)} g \, d\mu. \]

We may assume $g$ real. The sup and inf of $g$ on $f(S_j)$ differ only
by $\epsilon$ if $\delta$ is taken sufficiently small. Using the
theorem, applied to each $S_j$, and replacing $g$ by its minimum $m_j$
and maximum $M_j$ on $f(S_j)$, we see that the corollary follows at
once.


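Theorem 2.6 (Change of Variables Formula). Let $U$ be open in
$\mathbf{R}^n$ and let $f\colon U \to \mathbf{R}^n$ be a $C^1$ map,
which is $C^1$-invertible on $U$. Let $g \in \mathcal{L}^1(f(U))$. Then
$(g \circ f)|\Delta_f|$ is in $\mathcal{L}^1(U)$ and we have

\[ \int_{f(U)} g \, d\mu = \int_U (g \circ f)\,|\Delta_f| \, d\mu. \]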


Proof. Let $R$ be a closed rectangle contained in $U$. We shall first
prove that the restriction of $(g \circ f)|\Delta_f|$ to $R$ is in
$\mathcal{L}^1(R)$, and that the formula holds when $U$ is replaced by
$R$. We know that $C_c(f(U))$ is $L^1$-dense in $\mathcal{L}^1(f(U))$
by Theorem 3.1 of Chapter IX. Hence there exists a sequence $\{g_k\}$
in $C_c(f(U))$ which is $L^1$-convergent to $g$. Using Theorem 5.2 of
Chapter VI, we may assume that $\{g_k\}$ converges pointwise to $g$
except on a set $Z$ of measure 0 in $f(U)$. By Lemma 1.2, we know that
$f^{-1}(Z)$ has measure 0.

Let $g_k^* = (g_k \circ f)|\Delta_f|$. Each function $g_k^*$ is
continuous on $R$. The sequence $\{g_k^*\}$ converges almost everywhere
to $(g \circ f)|\Delta_f|$ restricted to $R$. It is in fact an
$L^1$-Cauchy sequence in $\mathcal{L}^1(R)$. To see this, we have by
the result for rectangles and continuous functions (corollary of the
preceding theorem):

\[ \int_R |g_k^* - g_m^*| \, d\mu = \int_{f(R)} |g_k - g_m| \, d\mu, \]

so the Cauchy nature of the sequence $\{g_k^*\}$ is clear from that of
$\{g_k\}$. It follows that the restriction of $(g \circ f)|\Delta_f|$
to $R$ is the $L^1$-limit of $\{g_k^*\}$, and is in $\mathcal{L}^1(R)$.
It also follows that the formula of the theorem holds for $R$, that is

\[ \int_{f(A)} g \, d\mu = \int_A (g \circ f)\,|\Delta_f| \, d\mu \]

when $A = R$.


The theorem is now seen to hold for any measurable subset $A$ of $R$,
since $f(A)$ is measurable, and since a function $g$ in
$\mathcal{L}^1(f(A))$ can be extended to a function in
$\mathcal{L}^1(f(R))$ by giving it the value 0 outside $f(A)$.
From this it follows that the theorem holds if $A$ is a finite union of
rectangles contained in $U$. We can find a sequence of rectangles
$\{R_m\}$ contained in $U$ whose union is equal to $U$, because $U$ is
separable. Taking the usual stepwise complementation, we can find a
disjoint sequence of measurable sets

\[ A_m = R_m - (R_1 \cup \cdots \cup R_{m-1}) \]

whose union is $U$, and such that our theorem holds if $A = A_m$. Let

\[ h_m = g\,\chi_{f(A_m)} \quad\text{and}\quad h_m^* = (h_m \circ f)\,|\Delta_f|. \]

Then $\sum h_m$ converges to $g$ and $\sum h_m^*$ converges to
$(g \circ f)|\Delta_f|$. Our theorem follows from Corollary 5.13 of the
dominated convergence theorem, Chapter VI.


Note. In dealing with polar coordinates or the like, one sometimes 
meets a map f which is invertible except on a set of measure 0. It is 
now trivial to recover a result covering this type of situation. 


Corollary 2.7. Let $U$ be open in $\mathbf{R}^n$ and let
$f\colon U \to \mathbf{R}^n$ be a $C^1$ map. Let $A$ be a measurable
subset of $U$ such that the boundary of $A$ has measure 0, and such
that $f$ is $C^1$-invertible on the interior of $A$. Let $g$ be in
$\mathcal{L}^1(f(A))$. Then $(g \circ f)|\Delta_f|$ is in
$\mathcal{L}^1(A)$ and

\[ \int_{f(A)} g \, d\mu = \int_A (g \circ f)\,|\Delta_f| \, d\mu. \]


Proof. Let $U_0$ be the interior of $A$. The sets $f(A)$ and $f(U_0)$
differ only by a set of measure 0, namely $f(\partial A)$. Also the
sets $A$, $U_0$ differ only
by a set of measure 0. Consequently we can replace the domains of
integration $f(A)$ and $A$ by $f(U_0)$ and $U_0$, respectively. The
theorem applies to conclude the proof of the corollary.
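
To see Corollary 2.7 at work in the polar coordinate situation
mentioned in the Note, here is a small numerical sketch in Python. The
choices $g(x, y) = e^{-(x^2 + y^2)}$, $A = [0, 1] \times [0, 2\pi]$,
and the function names are assumptions made for this illustration. The
map $f(r, \theta) = (r\cos\theta, r\sin\theta)$ is $C^1$-invertible on
the interior of $A$, with $\Delta_f(r, \theta) = r$, and $f(A)$ is the
closed unit disc, over which the integral of $g$ is
$\pi(1 - e^{-1})$.

```python
import math

def g(x, y):
    """Integrand on the unit disc f(A)."""
    return math.exp(-(x * x + y * y))

def polar(r, theta):
    """The map f(r, theta) = (r cos theta, r sin theta)."""
    return (r * math.cos(theta), r * math.sin(theta))

def jacobian_det(r, theta):
    """Det of [[cos t, -r sin t], [sin t, r cos t]], which equals r."""
    return r

def integral_via_polar(n):
    """Midpoint rule for the integral of (g o f)|Delta_f| over A."""
    dr = 1.0 / n
    dtheta = 2.0 * math.pi / n
    total = 0.0
    for i in range(n):
        r = (i + 0.5) * dr
        for j in range(n):
            theta = (j + 0.5) * dtheta
            x, y = polar(r, theta)
            total += g(x, y) * abs(jacobian_det(r, theta)) * dr * dtheta
    return total

approx = integral_via_polar(200)
exact = math.pi * (1.0 - math.exp(-1.0))
print(approx, exact)
```

The two printed numbers agree to several decimal places. The midpoint
rule never samples the boundary of $A$, which matches the fact that the
boundary has measure 0 and plays no role in the formula.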


Note. Since step maps are dense in $\mathcal{L}^1(X, E)$ for a Banach
space $E$, the preceding proof generalizes at once to the case of
Banach valued maps.


The change of variables formula depends on a $C^1$ isomorphism
$f\colon U \to V$ between open sets of $n$-space. It suggests that one
should define some object which changes by multiplication by the
Jacobian (or its absolute value) under such an isomorphism, and this is
what we shall do in the next section, by defining differential forms.
After that, we introduce a language, that of manifolds, which allows us
to speak invariantly about these objects.


XXI, §3. DIFFERENTIAL FORMS 


We recall first two simple results from linear (or rather multilinear)
algebra. We use the notation $E^{(r)} = E \times E \times \cdots \times E$,
$r$ times.

Theorem A. Let $E$ be a finite dimensional vector space over the reals
of dimension $n$. For each positive integer $r$ with $1 \le r \le n$
there exists a vector space $\bigwedge^r E$ and a multilinear
alternating map

\[ E^{(r)} \to \bigwedge\nolimits^r E \]

denoted by $(u_1, \ldots, u_r) \mapsto u_1 \wedge \cdots \wedge u_r$,
having the following property. If $\{v_1, \ldots, v_n\}$ is a basis of
$E$, then the elements

\[ \{v_{i_1} \wedge \cdots \wedge v_{i_r}\}, \qquad i_1 < i_2 < \cdots < i_r, \]

form a basis of $\bigwedge^r E$.

We recall that alternating means that $u_1 \wedge \cdots \wedge u_r = 0$
if $u_i = u_j$ for some $i \ne j$. We call $\bigwedge^r E$ the $r$-th
alternating product (or exterior product) of $E$. If $r = 0$, we define
$\bigwedge^0 E = \mathbf{R}$. Elements of $\bigwedge^r E$ which can be
written in the form $u_1 \wedge \cdots \wedge u_r$ are called
decomposable. Such elements generate $\bigwedge^r E$. If $r > \dim E$,
we define $\bigwedge^r E = \{0\}$.


Theorem B. For each pair of positive integers $(r, s)$, there exists a
unique product (bilinear map)

\[ \bigwedge\nolimits^r E \times \bigwedge\nolimits^s E \to \bigwedge\nolimits^{r+s} E \]

such that if $u_1, \ldots, u_r, w_1, \ldots, w_s \in E$ then

\[ (u_1 \wedge \cdots \wedge u_r) \times (w_1 \wedge \cdots \wedge w_s)
   \mapsto u_1 \wedge \cdots \wedge u_r \wedge w_1 \wedge \cdots \wedge w_s. \]

This product is associative.


The proofs for these two statements will be briefly summarized in the 
appendix to this chapter. 


Let $E^*$ be the dual space, $E^* = L(E, \mathbf{R})$. (We prefer here
to use $E^*$ rather than $E'$, first because we shall use the prime for
the derivative, and second because we want a certain notational
consistency as in §4.) If $E = \mathbf{R}^n$ and
$\lambda_1, \ldots, \lambda_n$ are the coordinate functions, then each
$\lambda_i$ is an element of the dual space, and in fact
$\{\lambda_1, \ldots, \lambda_n\}$ is a basis of this dual space.


Let $U$ be an open set in $\mathbf{R}^n$. By a differential form of
degree $r$ on $U$ (or an $r$-form) we mean a map

\[ \omega\colon U \to \bigwedge\nolimits^r E^* \]

from $U$ into the $r$-th alternating product of $E^*$. We say that the
form is of class $C^p$ if the map is of class $C^p$. (We view
$\bigwedge^r E^*$ as a normed vector space, using any norm. It does not
matter which, since all norms on a finite dimensional vector space are
equivalent.)

Since $\{\lambda_1, \ldots, \lambda_n\}$ is a basis of $E^*$, we can
express each differential form in terms of its coordinate functions
with respect to the basis

\[ \lambda_{i_1} \wedge \cdots \wedge \lambda_{i_r} \qquad (i_1 < \cdots < i_r), \]

namely for each $x \in U$ we have

\[ \omega(x) = \sum_{(i)} f_{i_1 \cdots i_r}(x)\, \lambda_{i_1} \wedge \cdots \wedge \lambda_{i_r} \]

where $f_{(i)} = f_{i_1 \cdots i_r}$ is a function on $U$. Each such
function has the same order of differentiability as $\omega$. We call
the preceding expression the standard form of $\omega$. We say that a
form is decomposable if it can be written as just one term
$f(x)\, \lambda_{i_1} \wedge \cdots \wedge \lambda_{i_r}$. Every
differential form is a sum of decomposable ones.

We agree to the convention that functions are differential forms of
degree 0.

It is clear that the differential forms of given degree $r$ form a
vector space, denoted by $\Omega^r(U)$.

Let $E = \mathbf{R}^n$. Let $f$ be a function on $U$. For each
$x \in U$ the derivative

\[ f'(x)\colon \mathbf{R}^n \to \mathbf{R} \]

is a linear map, and thus an element of the dual space. Thus

\[ f'\colon U \to E^* \]

is a differential form of degree 1, which is usually denoted by $df$.
If $f$ is of class $C^p$, then $df$ is of class $C^{p-1}$.
Let $\lambda_i$ be the $i$-th coordinate function. Then we know that

\[ d\lambda_i(x) = \lambda_i'(x) = \lambda_i \]

for each $x \in U$ because $\lambda'(x) = \lambda$ for any continuous
linear map $\lambda$. Whenever $\{x_1, \ldots, x_n\}$ are used
systematically for the coordinates of a point in $\mathbf{R}^n$, it is
customary in the literature to use the notation

\[ d\lambda_i(x) = dx_i. \]

This is slightly incorrect, but is useful in formal computations. We
shall also use it in this book on occasions. Similarly, we also write
(incorrectly)

\[ \omega = \sum_{(i)} f_{(i)}\, dx_{i_1} \wedge \cdots \wedge dx_{i_r} \]

instead of the correct

\[ \omega(x) = \sum_{(i)} f_{(i)}(x)\, \lambda_{i_1} \wedge \cdots \wedge \lambda_{i_r}. \]


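In terms of coordinates, the map $df$ (or $f'$) is given by

\[ df(x) = f'(x) = D_1 f(x)\,\lambda_1 + \cdots + D_n f(x)\,\lambda_n \]

where $D_i f(x) = \partial f/\partial x_i$ is the $i$-th partial
derivative. This is simply a restatement of the fact that if
$h = (h_1, \ldots, h_n)$ is a vector, then

\[ f'(x)h = \frac{\partial f}{\partial x_1} h_1 + \cdots + \frac{\partial f}{\partial x_n} h_n. \]

Thus in old notation, we have

\[ df(x) = \frac{\partial f}{\partial x_1}\, dx_1 + \cdots + \frac{\partial f}{\partial x_n}\, dx_n. \]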


Let $\omega$ and $\psi$ be forms of degrees $r$ and $s$ respectively,
on the open set $U$. For each $x \in U$ we can then take the
alternating product $\omega(x) \wedge \psi(x)$ and we define the
alternating product $\omega \wedge \psi$ by

\[ (\omega \wedge \psi)(x) = \omega(x) \wedge \psi(x). \]

If $f$ is a differential form of degree 0, that is a function, then we
define

\[ f \wedge \omega = f\omega \]

where $(f\omega)(x) = f(x)\omega(x)$. By definition, we then have

\[ \omega \wedge f\psi = f\omega \wedge \psi. \]


We shall now define the exterior derivative $d\omega$ for any
differential form $\omega$. We have already done it for functions. We
shall do it in general first in terms of coordinates, and then show
that there is a characterization independent of these coordinates. If

\[ \omega(x) = \sum_{(i)} f_{(i)}\, d\lambda_{i_1} \wedge \cdots \wedge d\lambda_{i_r} \]

we define

\[ d\omega = \sum_{(i)} df_{(i)} \wedge d\lambda_{i_1} \wedge \cdots \wedge d\lambda_{i_r}. \]


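Example. Suppose $n = 2$ and $\omega$ is a 1-form, given in terms of
the two coordinates $(x, y)$ by

\[ \omega(x, y) = f(x, y)\, dx + g(x, y)\, dy. \]

Then

\[ d\omega(x, y) = df(x, y) \wedge dx + dg(x, y) \wedge dy \]
\[ = \left( \frac{\partial f}{\partial x}\, dx + \frac{\partial f}{\partial y}\, dy \right) \wedge dx
   + \left( \frac{\partial g}{\partial x}\, dx + \frac{\partial g}{\partial y}\, dy \right) \wedge dy \]
\[ = \frac{\partial f}{\partial y}\, dy \wedge dx + \frac{\partial g}{\partial x}\, dx \wedge dy \]
\[ = \left( \frac{\partial f}{\partial y} - \frac{\partial g}{\partial x} \right) dy \wedge dx \]

because the terms involving $dx \wedge dx$ and $dy \wedge dy$ are equal
to 0.
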
Theorem 3.1. The map $d$ is linear, and satisfies

\[ d(\omega \wedge \psi) = d\omega \wedge \psi + (-1)^r \omega \wedge d\psi \]

if $r = \deg \omega$. The map $d$ is uniquely determined by these
properties, and by the fact that for a function $f$, we have $df = f'$.


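Proof. The linearity of $d$ is obvious. Hence it suffices to prove the
formula for decomposable forms. We note that for any function $f$ we
have

\[ d(f\omega) = df \wedge \omega + f\, d\omega. \]

Indeed, if $\omega$ is a function $g$, then from the derivative of a
product we get $d(fg) = f\, dg + g\, df$. If

\[ \omega = g\, d\lambda_{i_1} \wedge \cdots \wedge d\lambda_{i_r} \]

where $g$ is a function, then

\[ d(f\omega) = d(fg\, d\lambda_{i_1} \wedge \cdots \wedge d\lambda_{i_r})
   = d(fg) \wedge d\lambda_{i_1} \wedge \cdots \wedge d\lambda_{i_r} \]
\[ = (f\, dg + g\, df) \wedge d\lambda_{i_1} \wedge \cdots \wedge d\lambda_{i_r} \]
\[ = f\, d\omega + df \wedge \omega, \]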


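as desired. Now suppose that

\[ \omega = f\, d\lambda_{i_1} \wedge \cdots \wedge d\lambda_{i_r} = f\tilde\omega
   \quad\text{and}\quad
   \psi = g\, d\lambda_{j_1} \wedge \cdots \wedge d\lambda_{j_s} = g\tilde\psi \]

with $i_1 < \cdots < i_r$ and $j_1 < \cdots < j_s$ as usual. If some
$i_\nu = j_\mu$, then from the definitions we see that the expressions
on both sides of the equality in the theorem are equal to 0. Hence we
may assume that the sets of indices $i_1, \ldots, i_r$ and
$j_1, \ldots, j_s$ have no element in common. Then
$d(\tilde\omega \wedge \tilde\psi) = 0$ by definition, and

\[ d(\omega \wedge \psi) = d(fg\, \tilde\omega \wedge \tilde\psi)
   = d(fg) \wedge \tilde\omega \wedge \tilde\psi \]
\[ = (g\, df + f\, dg) \wedge \tilde\omega \wedge \tilde\psi \]
\[ = d\omega \wedge \psi + f\, dg \wedge \tilde\omega \wedge \tilde\psi \]
\[ = d\omega \wedge \psi + (-1)^r f\, \tilde\omega \wedge dg \wedge \tilde\psi \]
\[ = d\omega \wedge \psi + (-1)^r \omega \wedge d\psi, \]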


thus proving the desired formula, in the present case. (We used the
fact that $dg \wedge \tilde\omega = (-1)^r \tilde\omega \wedge dg$,
whose proof is left to the reader.) The formula in the general case
follows because any differential form can be expressed as a sum of
forms of the type just considered, and one can then use the bilinearity
of the product. Finally, $d$ is uniquely determined by the formula, and
its effect on functions, because any differential form is a sum of
forms of type $f\, d\lambda_{i_1} \wedge \cdots \wedge d\lambda_{i_r}$
and the formula gives an expression of $d$ in terms of its effect on
forms of lower degree. By induction, if the value of $d$ on functions
is known, its value can then be determined on forms of degree $\ge 1$.
This proves the theorem.

Corollary 3.2. Let $\omega$ be a form of class $C^2$. Then
$dd\omega = 0$.

Proof. If $f$ is a function, then

\[ df(x) = \sum_{j=1}^n \frac{\partial f}{\partial x_j}\, dx_j \]

and

\[ ddf(x) = \sum_{j=1}^n \sum_{k=1}^n \frac{\partial^2 f}{\partial x_k\, \partial x_j}\, dx_k \wedge dx_j. \]

Using the fact that the partials commute, and the fact that for any two
positive integers $r$, $s$ we have $dx_r \wedge dx_s = -dx_s \wedge dx_r$,
we see that the preceding double sum is equal to 0. A similar argument
shows that the theorem is true for 1-forms of type $g(x)\, dx_i$ where
$g$ is a function, and thus for all 1-forms by linearity. We proceed by
induction. It suffices to prove the formula in general for decomposable
forms. Let $\omega$ be decomposable of degree $r$, and write

\[ \omega = \eta \wedge \psi \]

where $\deg \psi = 1$. Using the formula of Theorem 3.1 twice, and the
fact that $dd\eta = 0$ and $dd\psi = 0$ by induction, we see at once
that $dd\omega = 0$, as was to be shown.
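
The coordinate definition of $d$ is mechanical enough to be carried out
by a short program. The following Python sketch, under conventions
chosen here purely for illustration, handles forms with polynomial
coefficients: a polynomial is a dictionary from exponent tuples to
integer coefficients, and a form is a dictionary from strictly
increasing index tuples (0-based) to polynomials. The names `partial`,
`add_into` and `d` are hypothetical helpers, not notation from the
text.

```python
def partial(poly, j):
    """Partial derivative D_j of a polynomial {exponents: coefficient}."""
    result = {}
    for exps, coeff in poly.items():
        if exps[j] > 0:
            new = list(exps)
            new[j] -= 1
            key = tuple(new)
            result[key] = result.get(key, 0) + coeff * exps[j]
    return {e: c for e, c in result.items() if c != 0}

def add_into(form, index, poly, sign):
    """Add sign * poly * dx_index to form, dropping zero coefficients."""
    target = form.setdefault(index, {})
    for exps, coeff in poly.items():
        value = target.get(exps, 0) + sign * coeff
        if value:
            target[exps] = value
        else:
            target.pop(exps, None)
    if not target:
        del form[index]

def d(form, n):
    """Exterior derivative of a form in n variables."""
    result = {}
    for index, poly in form.items():
        for j in range(n):
            if j in index:
                continue  # dx_j ^ dx_j = 0
            # Moving dx_j into sorted position costs one sign change
            # for each index in the tuple that is smaller than j.
            position = sum(1 for i in index if i < j)
            sign = -1 if position % 2 else 1
            new_index = index[:position] + (j,) + index[position:]
            add_into(result, new_index, partial(poly, j), sign)
    return result

# A function f = x^2 y z + y^3 in three variables, as a 0-form.
f0 = {(): {(2, 1, 1): 1, (0, 3, 0): 1}}
ddf = d(d(f0, 3), 3)

# The 1-form omega = xy dx + x^2 dy in two variables.
omega = {(0,): {(1, 1): 1}, (1,): {(2, 0): 1}}
d_omega = d(omega, 2)
print(ddf, d_omega)
```

Here `ddf` is the empty dictionary, that is the zero form, in agreement
with Corollary 3.2, and `d_omega` represents $x\, dx \wedge dy$, which
is $(\partial g/\partial x - \partial f/\partial y)\, dx \wedge dy$ for
$f = xy$, $g = x^2$.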


XXI, §4. INVERSE IMAGE OF A FORM 


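We start with some algebra once more. Let $E$, $F$ be finite
dimensional vector spaces over $\mathbf{R}$ and let
$\lambda\colon E \to F$ be a linear map. If
$\mu\colon F \to \mathbf{R}$ is an element of $F^*$, then we may form
the composite linear map

\[ \mu \circ \lambda\colon E \to \mathbf{R} \]

which we visualize as

\[ E \xrightarrow{\ \lambda\ } F \xrightarrow{\ \mu\ } \mathbf{R}. \]

We denote this composite $\mu \circ \lambda$ by $\lambda^*(\mu)$. It is
an element of $E^*$. We have a similar definition on the higher
alternating products, and in the appendix, we shall prove:

Theorem C. Let $\lambda\colon E \to F$ be a linear map. For each $r$
there exists a unique linear map

\[ \lambda^*\colon \bigwedge\nolimits^r F^* \to \bigwedge\nolimits^r E^* \]

having the following properties:

(i) $\lambda^*(\omega \wedge \psi) = \lambda^*(\omega) \wedge \lambda^*(\psi)$
    for $\omega \in \bigwedge^r F^*$, $\psi \in \bigwedge^s F^*$.
(ii) If $\mu \in F^*$ then $\lambda^*(\mu) = \mu \circ \lambda$, and
    $\lambda^*$ is the identity on $\bigwedge^0 F^* = \mathbf{R}$.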


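Remark. If $\mu_{j_1}, \ldots, \mu_{j_r}$ are in $F^*$, then from the
two properties of Theorem C, we conclude that

\[ \lambda^*(\mu_{j_1} \wedge \cdots \wedge \mu_{j_r})
   = (\mu_{j_1} \circ \lambda) \wedge \cdots \wedge (\mu_{j_r} \circ \lambda). \]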


Now we can apply this to differential forms. Let $U$ be open in
$E = \mathbf{R}^n$ and let $V$ be open in $F = \mathbf{R}^m$. Let
$f\colon U \to V$ be a $C^p$ map, $p \ge 1$. For each $x \in U$ we
obtain the linear map

\[ f'(x)\colon E \to F \]

to which we can apply the preceding discussion. Consequently, we can
reformulate Theorem C for differential forms as follows:


Theorem 4.1. Let $f\colon U \to V$ be a $C^p$ map, $p \ge 1$. Then for
each $r$ there exists a unique linear map

\[ f^*\colon \Omega^r(V) \to \Omega^r(U) \]

having the following properties:

(i) For any differential forms $\omega$, $\psi$ on $V$ we have
    $f^*(\omega \wedge \psi) = f^*(\omega) \wedge f^*(\psi)$.

(ii) If $g$ is a function on $V$ then $f^*(g) = g \circ f$, and if
    $\omega$ is a 1-form then
    $(f^*\omega)(x) = \omega(f(x)) \circ df(x)$.

We apply Theorem C to get Theorem 4.1 simply by letting
$\lambda = f'(x)$ at a given point $x$, and we define

\[ (f^*\omega)(x) = f'(x)^*\, \omega(f(x)). \]

Then Theorem 4.1 is nothing but Theorem C applied at each point $x$.


Example 1. Let $y_1, \ldots, y_m$ be the coordinates on $V$, and let
$\mu_j$ be the $j$-th coordinate function, $j = 1, \ldots, m$, so that
$y_j = \mu_j(y_1, \ldots, y_m)$. Let

\[ f\colon U \to V \]

be the map with coordinate functions

\[ y_j = f_j(x) = \mu_j \circ f(x). \]

If

\[ \omega(y) = g(y)\, dy_{j_1} \wedge \cdots \wedge dy_{j_s} \]

is a differential form on $V$, then

\[ f^*\omega = (g \circ f)\, df_{j_1} \wedge \cdots \wedge df_{j_s}. \]

Indeed, we have for $x \in U$:

\[ (f^*\omega)(x) = g(f(x))\, (\mu_{j_1} \circ f'(x)) \wedge \cdots \wedge (\mu_{j_s} \circ f'(x)) \]

and

\[ f_j'(x) = (\mu_j \circ f)'(x) = \mu_j \circ f'(x) = df_j(x). \]

Example 2. Let $f\colon [a, b] \to \mathbf{R}^2$ be a map from an
interval into the plane, and let $x$, $y$ be the coordinates of the
plane. Let $t$ be the coordinate in $[a, b]$. A differential form in
the plane can be written in the form

\[ \omega(x, y) = g(x, y)\, dx + h(x, y)\, dy \]

where $g$, $h$ are functions. Then by definition,

\[ f^*\omega(t) = g(x(t), y(t))\, \frac{dx}{dt}\, dt + h(x(t), y(t))\, \frac{dy}{dt}\, dt \]

if we write $f(t) = (x(t), y(t))$. Let $G = (g, h)$ be the vector field
whose components are $g$ and $h$. Then we can write

\[ f^*\omega(t) = G(f(t)) \cdot f'(t)\, dt, \]

which is essentially the expression which is integrated when defining
the integral of a vector field along a curve.
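
To make the last remark of Example 2 concrete, the following Python
sketch integrates $f^*\omega$ over an interval by the midpoint rule.
The particular choices $\omega = -y\, dx + x\, dy$ and
$f(t) = (\cos t, \sin t)$ on $[0, 2\pi]$, as well as the function
names, are assumptions made for this illustration only.

```python
import math

def curve(t):
    """The map f(t) = (x(t), y(t)), here the unit circle."""
    return (math.cos(t), math.sin(t))

def velocity(t):
    """The derivative f'(t) = (dx/dt, dy/dt)."""
    return (-math.sin(t), math.cos(t))

def G(x, y):
    """Components (g, h) of omega = g dx + h dy, here g = -y, h = x."""
    return (-y, x)

def integrate_pullback(a, b, n):
    """Midpoint rule for the integral of G(f(t)) . f'(t) dt over [a, b]."""
    dt = (b - a) / n
    total = 0.0
    for k in range(n):
        t = a + (k + 0.5) * dt
        gx, gy = G(*curve(t))
        vx, vy = velocity(t)
        total += (gx * vx + gy * vy) * dt
    return total

value = integrate_pullback(0.0, 2.0 * math.pi, 1000)
print(value, 2.0 * math.pi)
```

For this choice the integrand $G(f(t)) \cdot f'(t)$ is identically 1,
so the computed value is $2\pi$ up to rounding.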


Example 3. Let $U$, $V$ be both open sets in $n$-space, and let
$f\colon U \to V$ be a $C^p$ map. If

\[ \omega(y) = g(y)\, dy_1 \wedge \cdots \wedge dy_n \]

where $y_j = f_j(x)$ is the $j$-th coordinate of $y$, then

\[ dy_j = D_1 f_j(x)\, dx_1 + \cdots + D_n f_j(x)\, dx_n
        = \frac{\partial y_j}{\partial x_1}\, dx_1 + \cdots + \frac{\partial y_j}{\partial x_n}\, dx_n \]

and consequently, expanding out the alternating product according to
the usual multilinear and alternating rules, we find that

\[ f^*\omega(x) = g(f(x))\, \Delta_f(x)\, dx_1 \wedge \cdots \wedge dx_n. \]

As in §2, $\Delta_f$ is the determinant of the Jacobian matrix of $f$.
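
The expansion in Example 3 can be checked numerically for $n = 2$. In
the Python sketch below, the polar coordinate map and the names
`differential` and `pullback_coefficient` are assumptions chosen for
illustration, and the derivative is only approximated by central
differences. Writing $df_1 = a\, dx_1 + b\, dx_2$ and
$df_2 = c\, dx_1 + e\, dx_2$, the alternating rules give
$df_1 \wedge df_2 = (ae - bc)\, dx_1 \wedge dx_2$.

```python
import math

def polar(r, theta):
    """The map f(r, theta) = (r cos theta, r sin theta)."""
    return (r * math.cos(theta), r * math.sin(theta))

def differential(func, point, h=1e-6):
    """Central-difference approximation of the Jacobian matrix of func.

    Row j holds the coefficients of df_j with respect to dx_1, ..., dx_n.
    """
    n = len(point)
    columns = []
    for i in range(n):
        plus = list(point)
        minus = list(point)
        plus[i] += h
        minus[i] -= h
        fp, fm = func(*plus), func(*minus)
        columns.append([(p - m) / (2 * h) for p, m in zip(fp, fm)])
    # columns[i][j] approximates D_i f_j; transpose to index rows by j.
    return [[columns[i][j] for i in range(n)] for j in range(len(columns[0]))]

def pullback_coefficient(func, g, point):
    """Coefficient of dx_1 ^ dx_2 in f*(g dy_1 ^ dy_2) at the given point."""
    (a, b), (c, e) = differential(func, point)
    return g(*func(*point)) * (a * e - b * c)

point = (2.0, 0.7)
coeff_one = pullback_coefficient(polar, lambda x, y: 1.0, point)
coeff_g = pullback_coefficient(polar, lambda x, y: x * x + y * y, point)
print(coeff_one, coeff_g)
```

At $(r, \theta) = (2, 0.7)$ the first value is close to
$\Delta_f = r = 2$, and the second is close to
$g(f(x))\Delta_f(x) = r^2 \cdot r = 8$.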


Theorem 4.2. Let $f\colon U \to V$ and $g\colon V \to W$ be $C^p$ maps
of open sets. If $\omega$ is a differential form on $W$, then

\[ (g \circ f)^*(\omega) = f^*(g^*(\omega)). \]

Proof. This is an immediate consequence of the definitions.


Theorem 4.3. Let $f\colon U \to V$ be a $C^2$ map and let $\omega$ be a
differential form of class $C^1$ on $V$. Then

\[ f^*(d\omega) = d f^*\omega. \]

In particular, if $g$ is a function on $V$, then

\[ f^*(dg) = d(g \circ f). \]


Proof. We first prove this last relation. From the definitions, we have
$dg(y) = g'(y)$, whence by the chain rule,

\[ (f^*(dg))(x) = g'(f(x)) \circ f'(x) = (g \circ f)'(x) \]

and this last term is nothing else but $d(g \circ f)(x)$, whence the
last relation follows. For a form of degree 1, say

\[ \omega(y) = g(y)\, dy_1, \]

with $y_1 = f_1(x)$, we find

\[ (f^* d\omega)(x) = (g'(f(x)) \circ f'(x)) \wedge df_1(x). \]

Using the fact that $ddf_1 = 0$, together with Theorem 3.1, we get

\[ (d f^*\omega)(x) = (d(g \circ f))(x) \wedge df_1(x), \]

which is equal to the preceding expression. Any 1-form can be expressed
as a linear combination of forms $g_i\, dy_i$, so that our assertion is
proved for forms of degree 1.

The general formula can now be proved by induction. Using the
linearity of $f^*$, we may assume that $\omega$ is expressed as
$\omega = \psi \wedge \eta$ where $\psi$, $\eta$ have lower degree. We
apply Theorem 3.1, and (i) of Theorem 4.1 to

\[ f^* d\omega = f^*(d\psi \wedge \eta) + (-1)^r f^*(\psi \wedge d\eta) \]

and we see at once that this is equal to $d f^*\omega$, because by
induction, $f^* d\psi = d f^*\psi$ and $f^* d\eta = d f^*\eta$. This
proves the theorem.


Let $U$ be open in $n$-space, and let $\omega$ be a continuous
differential form on $U$ of degree $n$. We can associate a positive
measure with $\omega$ as follows. Let us write

\[ \omega(x) = h(x)\, dx_1 \wedge \cdots \wedge dx_n. \]

If $g \in C_c(U)$, we define $|\omega|$ by

\[ \langle g, |\omega| \rangle = \int_U g(x)\,|h(x)| \, dx_1 \cdots dx_n = \int_U g\,|h| \, d\mu. \]

Then $g \mapsto \langle g, |\omega| \rangle$ is a positive functional
on $C_c(U)$. We know that there exists a unique regular measure
associated with this functional, and we shall call this measure the
measure on $U$ associated with $|\omega|$. We may denote it by
$\mu_{|\omega|}$. It is characterized by the relation

\[ \langle g, |\omega| \rangle = \int_U g \, d\mu_{|\omega|}. \]

We shall analyze this measure more closely in Chapter XXIII.


XXI, §5. APPENDIX 


We shall give brief reviews of the proofs of the algebraic theorems which 
have been quoted in this chapter. 

We first discuss "formal linear combinations". Let $S$ be a set. We
wish to define what we mean by expressions

\[ c_1 s_1 + \cdots + c_n s_n \]

where $\{c_i\}$ are numbers, and $\{s_i\}$ are distinct elements of
$S$. What do we wish such a "sum" to be like? Well, we wish it to be
entirely determined by the "coefficients" $c_i$, and each "coefficient"
$c_i$ should be associated with the element $s_i$ of the set $S$. But
an association is nothing but a function. This suggests to us how to
define "sums" as above.

For each $s \in S$ and each number $c$ we define the symbol

\[ cs \]

to be the function which associates $c$ to $s$ and 0 to $z$ for any
element $z \in S$, $z \ne s$. If $b$, $c$ are numbers, then clearly

\[ b(cs) = (bc)s \quad\text{and}\quad (b + c)s = bs + cs. \]

We let $T$ be the set of all functions defined on $S$ which can be
written in the form

\[ c_1 s_1 + \cdots + c_n s_n \]

where $c_i$ are numbers, and $s_i$ are distinct elements of $S$. Note
that we have no problem now about addition, since we know how to add
functions.


We contend that if $s_1, \ldots, s_n$ are distinct elements of $S$,
then

\[ 1s_1, \ldots, 1s_n \]

are linearly independent. To prove this, suppose $c_1, \ldots, c_n$ are
numbers such that

\[ c_1 s_1 + \cdots + c_n s_n = 0 \quad\text{(the zero function).} \]

Then by definition, the left-hand side takes on the value $c_i$ at
$s_i$ and hence $c_i = 0$. This proves the desired linear independence.

In practice, it is convenient to abbreviate the notation, and to write
simply $s_i$ instead of $1s_i$. The elements of $T$, which are called
formal linear combinations of elements of $S$, can be expressed in the
form

\[ c_1 s_1 + \cdots + c_n s_n, \]

and any given element has a unique such expression, because of the
linear independence of $s_1, \ldots, s_n$. This justifies our
terminology.
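
The idea that a formal linear combination is nothing but a function on
$S$ with finitely many non-zero values translates directly into code.
In the following Python sketch, which is only one possible encoding, an
element of $T$ is a dictionary from elements of $S$ to non-zero
numbers. The helper names `term`, `add` and `scale` are hypothetical.

```python
def term(c, s):
    """The symbol cs: the function with value c at s and 0 elsewhere."""
    return {s: c} if c != 0 else {}

def add(f, g):
    """Pointwise sum of two finitely supported functions on S."""
    result = dict(f)
    for s, c in g.items():
        value = result.get(s, 0) + c
        if value != 0:
            result[s] = value
        else:
            result.pop(s, None)
    return result

def scale(b, f):
    """The function b * f, again stored without zero values."""
    return {s: b * c for s, c in f.items() if b * c != 0}

# The two identities b(cs) = (bc)s and (b + c)s = bs + cs, with S a set of letters.
lhs1, rhs1 = scale(2, term(3, "s")), term(6, "s")
lhs2, rhs2 = term(2 + 5, "s"), add(term(2, "s"), term(5, "s"))
combo = add(term(4, "s"), term(-1, "t"))  # the combination 4s - t
print(lhs1 == rhs1, lhs2 == rhs2, combo)
```

Storing only non-zero coefficients makes the expression of an element
unique, which mirrors the linear independence statement above: two
dictionaries are equal exactly when all coefficients agree.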


We now come to the statements concerning multilinear alternating
products. Let $E$, $F$ be vector spaces over $\mathbf{R}$. As before,
let

\[ E^{(r)} = E \times \cdots \times E, \]

taken $r$ times. Let

\[ f\colon E^{(r)} \to F \]

be an $r$-multilinear alternating map. Let $v_1, \ldots, v_n$ be
linearly independent elements of $E$. Let $A = (a_{ij})$ be an
$r \times n$ matrix and let

\[ u_1 = a_{11} v_1 + \cdots + a_{1n} v_n \]
\[ \vdots \]
\[ u_r = a_{r1} v_1 + \cdots + a_{rn} v_n. \]

Then

\[ f(u_1, \ldots, u_r) = f(a_{11} v_1 + \cdots + a_{1n} v_n, \ldots, a_{r1} v_1 + \cdots + a_{rn} v_n) \]
\[ = \sum_\sigma f(a_{1,\sigma(1)} v_{\sigma(1)}, \ldots, a_{r,\sigma(r)} v_{\sigma(r)}) \]
\[ = \sum_\sigma a_{1,\sigma(1)} \cdots a_{r,\sigma(r)}\, f(v_{\sigma(1)}, \ldots, v_{\sigma(r)}) \]

where the sum is taken over all maps
$\sigma\colon \{1, \ldots, r\} \to \{1, \ldots, n\}$. In this sum, all
terms will be 0 whenever $\sigma$ is not an injective mapping, that is
whenever there is some pair $i$, $j$ with $i \ne j$ such that
$\sigma(i) = \sigma(j)$, because of the alternating property of $f$.
From now on, we consider only injective maps $\sigma$. Then
$(\sigma(1), \ldots, \sigma(r))$ is simply a permutation of some
$r$-tuple $(i_1, \ldots, i_r)$ with $i_1 < \cdots < i_r$.


We wish to rewrite this sum in terms of a determinant. 


For each subset $S$ of $\{1, \ldots, n\}$ consisting of precisely $r$
elements, we can take the $r \times r$ submatrix of $A$ consisting of
those elements $a_{ij}$ such that $j \in S$. We denote by

\[ \mathrm{Det}_S(A) \]

the determinant of this submatrix. We also call it the subdeterminant
of $A$ corresponding to the set $S$. We denote by $P(S)$ the set of
maps

\[ \sigma\colon \{1, \ldots, r\} \to \{1, \ldots, n\} \]

whose image is precisely the set $S$. Then

\[ \mathrm{Det}_S(A) = \sum_{\sigma \in P(S)} \epsilon_S(\sigma)\, a_{1,\sigma(1)} \cdots a_{r,\sigma(r)}, \]

where $\epsilon_S(\sigma)$ is the sign of $\sigma$, depending only on
$\sigma$. In terms of this notation, we can write our expression for
$f(u_1, \ldots, u_r)$ in the form

\[ (1) \qquad f(u_1, \ldots, u_r) = \sum_S \mathrm{Det}_S(A)\, f(v_S) \]

where $v_S$ denotes $(v_{i_1}, \ldots, v_{i_r})$ if
$i_1 < \cdots < i_r$ are the elements of the set $S$. The sum over $S$
is taken over all subsets of $\{1, \ldots, n\}$ having precisely $r$
elements.


Theorem A. Let $E$ be a vector space over $\mathbf{R}$, of dimension
$n$. Let $r$ be an integer, $1 \le r \le n$. There exists a finite
dimensional space $\bigwedge^r E$ and an $r$-multilinear alternating
map $E^{(r)} \to \bigwedge^r E$ denoted by

\[ (u_1, \ldots, u_r) \mapsto u_1 \wedge \cdots \wedge u_r \]

satisfying the following properties:

AP 1. If $F$ is a vector space over $\mathbf{R}$ and
$g\colon E^{(r)} \to F$ is an $r$-multilinear alternating map, then
there exists a unique linear map

\[ g_*\colon \bigwedge\nolimits^r E \to F \]

such that for all $u_1, \ldots, u_r \in E$ we have

\[ g(u_1, \ldots, u_r) = g_*(u_1 \wedge \cdots \wedge u_r). \]

AP 2. If $\{v_1, \ldots, v_n\}$ is a basis of $E$, then the set of
elements

\[ v_{i_1} \wedge \cdots \wedge v_{i_r}, \qquad 1 \le i_1 < \cdots < i_r \le n, \]

is a basis of $\bigwedge^r E$.


Proof. For each subset $S$ of $\{1, \ldots, n\}$ consisting of
precisely $r$ elements, we select a letter $t_S$. As explained at the
beginning of the section, these letters $t_S$ form a basis of a vector
space whose dimension is equal to the binomial coefficient
$\binom{n}{r}$. It is the space of formal linear combinations of these
letters. Instead of $t_S$, we could also write
$t_{(i)} = t_{i_1 \cdots i_r}$ with $i_1 < \cdots < i_r$. Let
$\{v_1, \ldots, v_n\}$ be a basis of $E$ and let $u_1, \ldots, u_r$ be
elements of $E$. Let $A = (a_{ij})$ be the matrix of numbers such that

\[ u_1 = a_{11} v_1 + \cdots + a_{1n} v_n \]
\[ \vdots \]
\[ u_r = a_{r1} v_1 + \cdots + a_{rn} v_n. \]

Define

\[ u_1 \wedge \cdots \wedge u_r = \sum_S \mathrm{Det}_S(A)\, t_S. \]

We contend that this product has the required properties.
The fact that it is multilinear and alternating simply follows from the
corresponding property of the determinant.
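
The definition of the product by subdeterminants can be tried out on
small examples. The following Python sketch computes the coordinates
$\mathrm{Det}_S(A)$ of $u_1 \wedge \cdots \wedge u_r$ with respect to
the letters $t_S$, the vectors $u_i$ being given by their coordinates
in a fixed basis. The function names and the 0-based indexing of the
sets $S$ are conventions assumed here for illustration, and the
determinant is computed by the permutation expansion, which is only
practical for small $r$.

```python
from itertools import combinations, permutations

def sign(perm):
    """Sign of a permutation of 0, ..., r-1, by counting inversions."""
    inversions = sum(
        1
        for i in range(len(perm))
        for j in range(i + 1, len(perm))
        if perm[i] > perm[j]
    )
    return -1 if inversions % 2 else 1

def det(matrix):
    """Determinant of a square matrix by the expansion over permutations."""
    r = len(matrix)
    total = 0
    for perm in permutations(range(r)):
        term = sign(perm)
        for row in range(r):
            term *= matrix[row][perm[row]]
        total += term
    return total

def wedge(vectors):
    """Coordinates {S: Det_S(A)} of u_1 ^ ... ^ u_r, where row i of A is u_i."""
    r, n = len(vectors), len(vectors[0])
    return {
        S: det([[row[j] for j in S] for row in vectors])
        for S in combinations(range(n), r)
    }

u1, u2 = (1, 2, 0), (0, 1, 3)
w = wedge([u1, u2])
w_swapped = wedge([u2, u1])
w_equal = wedge([u1, u1])
print(w, w_swapped, w_equal)
```

Interchanging $u_1$ and $u_2$ changes the sign of every coordinate, and
taking $u_1 = u_2$ gives 0, as the alternating property requires.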
We note that if $S = \{i_1, \ldots, i_r\}$ with $i_1 < \cdots < i_r$,
then

\[ t_S = v_{i_1} \wedge \cdots \wedge v_{i_r}. \]

A standard theorem on linear maps asserts that there always exists a
unique linear map having prescribed values on basis elements. In
particular, if $g\colon E^{(r)} \to F$ is a multilinear alternating
map, then there exists a unique linear map

\[ g_*\colon \bigwedge\nolimits^r E \to F \]

such that for each set $S$, we have

\[ g_*(t_S) = g(v_{i_1}, \ldots, v_{i_r}) \]

if $i_1, \ldots, i_r$ are as above. By formula (1), it follows that

\[ g(u_1, \ldots, u_r) = g_*(u_1 \wedge \cdots \wedge u_r) \]

for all elements $u_1, \ldots, u_r$ of $E$. This proves AP 1.

As for AP 2, let $\{w_1, \ldots, w_n\}$ be a basis of $E$. From the
expansion of (1), it follows that the elements $\{w_S\}$, i.e. the
elements $\{w_{i_1} \wedge \cdots \wedge w_{i_r}\}$ with all possible
choices of $r$-tuples $(i_1, \ldots, i_r)$ satisfying
$i_1 < \cdots < i_r$, are generators of $\bigwedge^r E$. The number of
such elements is precisely $\binom{n}{r}$. Hence they must be linearly
independent, and form a basis of $\bigwedge^r E$, as was to be shown.


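Theorem B. For each pair of positive integers $(r, s)$ there exists a
unique bilinear map

\[ \bigwedge\nolimits^r E \times \bigwedge\nolimits^s E \to \bigwedge\nolimits^{r+s} E \]

such that if $u_1, \ldots, u_r, w_1, \ldots, w_s \in E$ then

\[ (u_1 \wedge \cdots \wedge u_r) \times (w_1 \wedge \cdots \wedge w_s)
   \mapsto u_1 \wedge \cdots \wedge u_r \wedge w_1 \wedge \cdots \wedge w_s. \]

This product is associative.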


Proof. For each $r$-tuple $(u_1, \ldots, u_r)$ consider the map of
$E^{(s)}$ into $\bigwedge^{r+s} E$ given by

\[ (w_1, \ldots, w_s) \mapsto u_1 \wedge \cdots \wedge u_r \wedge w_1 \wedge \cdots \wedge w_s. \]

This map is obviously $s$-multilinear and alternating. Consequently, by
AP 1 of Theorem A, there exists a unique linear map

\[ g_{(u)} = g_{u_1, \ldots, u_r}\colon \bigwedge\nolimits^s E \to \bigwedge\nolimits^{r+s} E \]

such that for any elements $w_1, \ldots, w_s \in E$ we have

\[ g_{(u)}(w_1 \wedge \cdots \wedge w_s) = u_1 \wedge \cdots \wedge u_r \wedge w_1 \wedge \cdots \wedge w_s. \]

Now the association $(u) \mapsto g_{(u)}$ is clearly an $r$-multilinear
alternating map of $E^{(r)}$ into
$L(\bigwedge^s E, \bigwedge^{r+s} E)$, and again by AP 1 of Theorem A,
there exists a unique linear map

\[ g_*\colon \bigwedge\nolimits^r E \to L\bigl(\bigwedge\nolimits^s E, \bigwedge\nolimits^{r+s} E\bigr) \]

such that for all elements $u_1, \ldots, u_r \in E$ we have

\[ g_{u_1, \ldots, u_r} = g_*(u_1 \wedge \cdots \wedge u_r). \]

To obtain the desired product
$\bigwedge^r E \times \bigwedge^s E \to \bigwedge^{r+s} E$, we simply
take the association

\[ (\omega, \psi) \mapsto g_*(\omega)(\psi). \]

It is bilinear, and is uniquely determined since elements of the form
$u_1 \wedge \cdots \wedge u_r$ generate $\bigwedge^r E$, and elements
of the form $w_1 \wedge \cdots \wedge w_s$ generate $\bigwedge^s E$.
This product is associative, as one sees at once on decomposable
elements, and then on all elements by linearity. This proves
Theorem B.


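Let $E$, $F$ be vector spaces, finite dimensional over $\mathbf{R}$,
and let $\lambda\colon E \to F$ be a linear map. If
$\mu\colon F \to \mathbf{R}$ is an element of the dual space $F^*$,
i.e. a linear map of $F$ into $\mathbf{R}$, then we may form the
composite linear map

\[ \mu \circ \lambda\colon E \to \mathbf{R} \]

which we visualize as

\[ E \xrightarrow{\ \lambda\ } F \xrightarrow{\ \mu\ } \mathbf{R}. \]

We denote this composite $\mu \circ \lambda$ by $\lambda^*(\mu)$. It is
an element of $E^*$.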


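Theorem C. Let $\lambda\colon E \to F$ be a linear map. For each $r$
there exists a unique linear map

\[ \lambda^*\colon \bigwedge\nolimits^r F^* \to \bigwedge\nolimits^r E^* \]

having the following properties:

(i) $\lambda^*(\omega \wedge \psi) = \lambda^*(\omega) \wedge \lambda^*(\psi)$,
    for $\omega \in \bigwedge^r F^*$, $\psi \in \bigwedge^s F^*$.
(ii) If $\mu \in F^*$ then $\lambda^*(\mu) = \mu \circ \lambda$, and
    $\lambda^*$ is the identity on $\bigwedge^0 F^* = \mathbf{R}$.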


Proof. The composition of mappings

\[ F^* \times \cdots \times F^* = F^{*(r)} \to E^* \times \cdots \times E^* = E^{*(r)} \to \bigwedge\nolimits^r E^* \]

given by

\[ (\mu_1, \ldots, \mu_r) \mapsto (\mu_1 \circ \lambda, \ldots, \mu_r \circ \lambda)
   \mapsto (\mu_1 \circ \lambda) \wedge \cdots \wedge (\mu_r \circ \lambda) \]

is obviously multilinear and alternating. Hence there exists a unique
linear map $\bigwedge^r F^* \to \bigwedge^r E^*$ such that

\[ \mu_1 \wedge \cdots \wedge \mu_r \mapsto \lambda^*(\mu_1) \wedge \cdots \wedge \lambda^*(\mu_r). \]

Property (i) now follows by linearity and the fact that decomposable
elements $\mu_1 \wedge \cdots \wedge \mu_r$ generate $\bigwedge^r F^*$.
Property (ii) comes from the definition. This proves Theorem C.


CHAPTER XXII


Manifolds 


XXII, §1. ATLASES, CHARTS, MORPHISMS 


Let $X$ be a set. An atlas of class $C^p$ ($p \ge 0$) on $X$ is a
family of pairs $\{(U_i, \varphi_i)\}$ ($i \in I$) satisfying the
following conditions:

AT 1. Each $U_i$ is a subset of $X$ and the $U_i$ cover $X$.

AT 2. Each $\varphi_i$ is a bijection of $U_i$ onto an open subset of a
euclidean space $E$, and for every pair $i$, $j$ of indices, the set
$\varphi_i(U_i \cap U_j)$ is open in $E$.

AT 3. The map

\[ \varphi_j \circ \varphi_i^{-1}\colon \varphi_i(U_i \cap U_j) \to \varphi_j(U_i \cap U_j) \]

is a $C^p$-isomorphism for each pair of indices $i$, $j$.


The space E is assumed to be the same for all i. If its dimension is n, 
we say that the atlas is n-dimensional. All that is done in this chapter 
would go over to the Banach case, but the principal applications we have 
in mind in the next chapter are strictly finite dimensional, and so for a 
first introduction to manifolds here we make the finite dimensionality 
assumption at once. Readers are referred to [La 2] for the general 
development, in a systematic way. They will note that there is essentially 
no change from the partial development given here. 

Each pair $(U_i, \varphi_i)$ will be called a chart of the atlas. We
see that the inverse map

\[ \varphi_i^{-1}\colon \varphi_i U_i \to U_i \]

may be interpreted as a parametrization of a portion of $X$ by an open
set in euclidean space. Thus in particular, $X$ is a set which can be
covered by subsets, each of which is so parametrized. The extra
condition AT 3 is one which will allow us to speak of differentiability
relative to $X$ itself.


Since $\varphi\colon U \to \mathbf{R}^n$ is a map into $n$-space, we
can represent $\varphi$ by coordinate functions, and we can write for
$x \in U$,

\[ \varphi(x) = (\varphi_1(x), \ldots, \varphi_n(x)) = (x_1, \ldots, x_n). \]

We call $(x_1, \ldots, x_n)$ the local coordinates of $x$ in the chart
$(U, \varphi)$. The notation here is already somewhat concise, but
useful. If readers feel the need for it, they may extend this notation
as follows. Denote a point of $X$ by $P$. Then in a chart
$\varphi\colon U \to \mathbf{R}^n$ at $P$, we have coordinates
$(x_1(P), \ldots, x_n(P))$ for the point $\varphi(P)$, $P \in U$, and
we abbreviate this $n$-tuple by $x(P)$. In most cases, it is a useful
abbreviation to do away with the extra letter.

Let $U$ be a subset of $X$ and let $\varphi\colon U \to \varphi U$ be a
bijection of $U$ onto an open subset of $E$. We say that the pair
$(U, \varphi)$ is compatible with the atlas $\{(U_i, \varphi_i)\}$ if
each map $\varphi_i \varphi^{-1}$ (defined on a suitable intersection
as in AT 3) is a $C^p$-isomorphism. Two atlases are said to be
compatible if each chart of one is compatible with the other atlas. The
relation of compatibility between atlases is immediately verified to be
an equivalence relation. An equivalence class of $C^p$-atlases on $X$
is said to define a structure of $C^p$-manifold on $X$. The number $n$
being fixed, we say that $X$ is then an $n$-dimensional manifold.

If $(U, \varphi)$ and $(V, \psi)$ are two charts of a manifold, then we
shall call the map $\varphi \circ \psi^{-1}$ (defined on
$\psi(U \cap V)$, whenever $U \cap V$ is not empty) a transition map.

So far we have not assumed that $X$ has a topology. In many cases, a
topology is first given, and then to make the atlases topologically
compatible with this topology, one can require the additional condition
that the maps $\varphi$ of the charts be homeomorphisms. However, it is
also useful not to do this and deal with the more general situation
when $X$ is merely a set, and we shall in fact have an important
application later when we deal with the tangent bundle.

We shall now see how to define a topology on X by means of the 
atlases. Let {(U,, @,)} be an atlas. A subset U of X is defined to be open 
if and only if the intersection UU, with each open set of the atlas is 
such that g(UoU,) is open, in E of course. It is a trivial exercise to 
verify that this defines a topology. Furthermore, if {(V,, w;)} is an equiva- 
lent atlas, then the two topologies coincide. We leave the formal verifica- 
tion to the reader. We note merely that the basic reason is that if a 
point x lies in charts U; and V,, then there is a subset W containing x 
such that g,W and w,W are open. Since a topology is really determined 
locally (ic. an open set is a union of open neighborhoods of its points) 
one sees at once that a set is open relative to one atlas if and only if it is 
open relative to the other. 

Let X be a manifold, and U an open subset of X. Then it is possible, 
in the obvious way, to induce a manifold structure on U, by taking as 
atlases the intersections 


$$(U_i \cap U,\ \varphi_i|(U_i \cap U)).$$


Example 1. Any open set in euclidean space is a manifold, the charts 
being the obvious ones: $C^p$-isomorphisms of open subsets onto other
open sets in euclidean space. 


Example 2. We speak of R/Z as the circle group. Then R/Z is a 
compact manifold, for which we can find an atlas consisting of two 
charts. The open interval (0, 1) maps bijectively onto an open subset of 
R/Z (by assigning to each real number its equivalence class modulo Z), 
and the open interval $(-\tfrac12, \tfrac12)$ also maps bijectively onto an open subset
of R/Z. Readers will verify at once that these two maps are the charts of
a $C^\infty$ atlas.
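The verification asked of the reader in Example 2 can be sketched numerically. The following is only an illustration under our own conventions: a point of R/Z is represented by any real number t in its class, and the names `chart1`, `chart2` and `transition` are ours, not the text's. On the overlap of the two chart domains the transition map turns out to be a translation on each interval, which is why it is $C^\infty$.

```python
import math

def chart1(t):
    """Chart on the image of (0, 1): the representative of t mod 1 in (0, 1)."""
    s = t - math.floor(t)
    if s == 0.0:
        raise ValueError("the class of 0 is not in the first chart domain")
    return s

def chart2(t):
    """Chart on the image of (-1/2, 1/2): the representative in (-1/2, 1/2)."""
    s = (t + 0.5) - math.floor(t + 0.5) - 0.5
    if s == -0.5:
        raise ValueError("the class of 1/2 is not in the second chart domain")
    return s

def transition(s):
    """chart2 composed with the inverse of chart1, for s in (0, 1), s != 1/2.

    The inverse of chart1 sends s to its class mod Z, which s itself represents.
    """
    return chart2(s)

# On (0, 1/2) the transition is s -> s; on (1/2, 1) it is s -> s - 1.
for s in (0.25, 0.75):
    print(s, transition(s))
```

In both intervals the transition map has derivative 1, so all higher derivatives vanish.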


Example 3. Instead of R/Z we can take $\mathbf{R}^n/\mathbf{Z}^n$, the n-dimensional
torus, and define charts similarly.


Example 4. Let $S^n$ be the n-sphere in $\mathbf{R}^{n+1}$, i.e. the set of all points
$(x_1, \ldots, x_{n+1})$ such that

$$x_1^2 + \cdots + x_{n+1}^2 = 1.$$

Then $S^n$ is a manifold, if we define charts as follows. Let

$$f(x_1, \ldots, x_{n+1}) = x_1^2 + \cdots + x_{n+1}^2.$$

The sphere is the set of points x such that f(x) = 1. For any point
$a \in S^n$, $a = (a_1, \ldots, a_{n+1})$, some coordinate is not equal to 0, say $a_1$. Then

$$D_1 f(a) \ne 0,$$

and we can apply the implicit function theorem, so that there is a $C^\infty$
map $\varphi_1$ defined on an open neighborhood U of $(a_2, \ldots, a_{n+1})$ such that

$$f(\varphi_1(x), x_2, \ldots, x_{n+1}) = 1$$

and $\varphi_1(a_2, \ldots, a_{n+1}) = a_1$. Furthermore, if we take U small enough, then
$\varphi_1$ is uniquely determined. Let

$$\varphi(x_2, \ldots, x_{n+1}) = (\varphi_1(x), x_2, \ldots, x_{n+1}).$$

It is an exercise to verify that the collection of all similar pairs $(\varphi U, \varphi^{-1})$
is a $C^\infty$ atlas for $S^n$. Actually, we shall obtain some theorems below
which will prove this, and give general criteria showing that certain 
subsets of euclidean space are manifolds. 
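For the sphere the function given by the implicit function theorem can be written explicitly as a square root. The sketch below assumes this explicit formula and uses illustrative names (`phi1`, `parametrize`) that do not come from the text. It checks on one sample point that the parametrization lands on the sphere.

```python
import math

def f(x):
    """f(x) = x_1^2 + ... + x_{n+1}^2."""
    return sum(t * t for t in x)

def phi1(rest, sign=1.0):
    """Solve f(x_1, rest) = 1 for x_1 on the branch with the given sign."""
    s = 1.0 - sum(t * t for t in rest)
    if s <= 0.0:
        raise ValueError("point outside the domain of this parametrization")
    return sign * math.sqrt(s)

def parametrize(rest, sign=1.0):
    """The map (x_2, ..., x_{n+1}) -> (phi1(x), x_2, ..., x_{n+1})."""
    return (phi1(rest, sign),) + tuple(rest)

p = parametrize((0.6, 0.0))
print(p, f(p))
```

The choice of sign corresponds to the choice of the point a: a positive first coordinate gives the upper branch, a negative one the lower branch.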


In our definition of a manifold, it was convenient to take the charts as 
maps from the set X into the vector space. In our examples, we actually 
defined their inverses. We may visualize a manifold as a set which is 
parametrized locally by open subsets of some euclidean space. The pa- 
rametrizing maps are the inverse maps of the charts. The whole point 
of condition AT 3 is to ensure that the parametrizations are compatible 
with a certain order of differentiability. 


Example 5. Let X = R and let $\varphi\colon X \to \mathbf{R}$ be the map $\varphi(x) = x^3$. Then
$(X, \varphi)$ is a chart defining an atlas. We therefore get a differentiable
structure on R, but the identity map is not $C^1$ compatible with this atlas,
because the map $x \mapsto x^{1/3}$ is not differentiable at 0.
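The failure of differentiability in Example 5 can be seen numerically: the difference quotient of the cube root at 0 equals $h^{-2/3}$, which grows without bound as h tends to 0. The sketch below is only an illustration, and the helper names are ours.

```python
import math

def cube_root(y):
    """Real cube root, the inverse of the chart phi(x) = x**3."""
    return math.copysign(abs(y) ** (1.0 / 3.0), y)

def difference_quotient(h):
    """(cube_root(h) - cube_root(0)) / h, the quotient whose limit would be the derivative at 0."""
    return (cube_root(h) - cube_root(0.0)) / h

# The quotients increase without bound as h decreases to 0.
quotients = [difference_quotient(10.0 ** (-k)) for k in (2, 4, 6, 8)]
print(quotients)
```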


Let X, Y be manifolds. Then the product X x Y is a manifold in an 
obvious way. If $\{(U_i, \varphi_i)\}$ and $\{(V_j, \psi_j)\}$ are atlases for X, Y, respectively,
then

$$\{(U_i \times V_j,\ \varphi_i \times \psi_j)\}$$


is an atlas for the product, and the product of compatible atlases gives 
rise to compatible atlases, so that we do get a well-defined product 
manifold.

We know what it means for a map from an open set in euclidean
space into another euclidean space to be differentiable, or of class $C^p$.
Since our definition of a manifold is based locally on open sets in euclidean
space, we can now define the notion of a $C^p$ map from one manifold
into another. Let X, Y be $C^p$-manifolds and $f\colon X \to Y$ a map. We
say that f is a $C^p$ map if given $x \in X$ there exists a chart $(U, \varphi)$ at x and
a chart $(V, \psi)$ at f(x) such that $f(U) \subset V$ and such that the map

$$f_{U,V} = \psi \circ f \circ \varphi^{-1}\colon \varphi U \to \psi V$$

is a $C^p$ map. If this holds, then this same condition holds for any choice
of charts $(U, \varphi)$ at x and $(V, \psi)$ at f(x) such that $f(U) \subset V$.

It is clear that the composite of two $C^p$ maps is itself a $C^p$ map
(because it is true for open subsets of euclidean space). 

It should be noted that $C^p$ manifolds and maps are useful with p finite
because Banach space techniques can be applied to sets of mappings.
Indeed, the $C^p$-bounded maps of one open set of euclidean space into
another form a Banach space. Manifold theory goes through if instead
of euclidean space we take a Banach space in the definition of a manifold,
and one can then give a manifold structure to the set of $C^p$ maps of
one manifold into another. We don’t go into this aspect of manifold 
theory in this book, but readers should keep this possibility in mind for 
future applications. 

We shall deal with a fixed p throughout a discussion. Thus it is
convenient to call a $C^p$ map $f\colon X \to Y$ by a neutral name, and we call
such maps morphisms. (If the order of differentiability needs to be
specified, we can always add the $C^p$ prefix.) By a $C^p$-isomorphism, or simply
isomorphism $f\colon X \to Y$, we mean a morphism for which there exists an
inverse morphism, i.e. a morphism $g\colon Y \to X$ such that $g \circ f$ and $f \circ g$ are
the identity mappings of X and Y, respectively. This is the same
terminology which we used with respect to open sets in euclidean spaces.
Similarly, we have the notion of a local $C^p$-isomorphism at a point $x \in X$,
meaning that f induces an isomorphism of an open neighborhood of x in
X onto an open neighborhood of f(x) in Y.

A manifold may arise like the torus, not embedded in any particular
euclidean space, or it may be given as a subset of some euclidean space
like the sphere. We now study this second possibility.

Let X be a topological space and Y a subspace. We say that Y is
locally closed in X if every point $y \in Y$ has an open neighborhood U in
X such that $Y \cap U$ is closed in U. We leave it to the reader to verify
that a locally closed subset of X is the intersection of an open set and a
closed set in X. For instance any open subset of X is locally closed, and
any open interval is locally closed in the plane.

Let X be a manifold and Y a subset. We shall say that Y is a 
submanifold if, roughly speaking, at each point $y \in Y$ there exists a chart
such that in this chart, the points of Y correspond to a factor in a
product space. We now make this condition precise as follows. For each
$y \in Y$ there exists a chart $(V, \varphi)$ at y such that $\varphi$ gives an isomorphism

$$\varphi\colon V \to V_1 \times V_2,$$

where $V_1$ is open in some space $E_1$, and $V_2$ is open in some space $E_2$,
and such that

$$\varphi(Y \cap V) = V_1 \times \{a_2\}$$

for some point $a_2 \in V_2$. If we make a translation on $V_2$, it is clear that
we can always adjust $V_2$ such that $a_2 = 0$.

If we let $E = E_1 \times E_2$, then the coordinates split up naturally, namely
we can write $\mathbf{R}^n = \mathbf{R}^m \times \mathbf{R}^q$, and

$$(x_1, \ldots, x_n) = (x_1, \ldots, x_m, x_{m+1}, \ldots, x_{m+q}).$$

If $a_2 = 0$ in our preceding definition, then we see that the points of Y in
the given chart $(V, \varphi)$ are precisely the points having coordinates

$$(x_1, \ldots, x_m, 0, \ldots, 0).$$


All of this explains what we said about Y being locally at each point a 
factor in a product. 

We observe that if Y is a submanifold of X, then Y is locally closed in 
X. We must also justify our terminology by showing that Y is a manifold
in its own right. Indeed, if $(V, \varphi)$ is a chart at y as in our definition,
then $\varphi$ induces a bijection

$$\varphi_1\colon Y \cap V \to V_1.$$

The collection of pairs $\{(Y \cap V, \varphi_1)\}$ obtained in the above manner
constitutes an atlas for Y, of class $C^p$.


The proof of this statement is essentially a triviality, and consists merely 
in keeping the definitions straight. We give it in full. Let 


$$\psi\colon W \to W_1 \times W_2$$

be another chart at y such that

$$\psi(Y \cap W) = W_1 \times \{b_2\}$$

for some point $b_2 \in W_2$. Then we get the bijection $\psi_1\colon Y \cap W \to W_1$.
Furthermore,

$\varphi(Y \cap V \cap W)$ is open in $V_1 \times \{a_2\}$ and thus equal to $V_1' \times \{a_2\}$
for some open $V_1'$ in $V_1$. Similarly,

$\psi(Y \cap V \cap W)$ is open in $W_1 \times \{b_2\}$ and thus equal to $W_1' \times \{b_2\}$
for some open $W_1'$ in $W_1$. We have isomorphisms

$$V \cap W \to V' \subset V_1 \times V_2 \quad\text{and}\quad V \cap W \to W' \subset W_1 \times W_2$$

under $\varphi$ and $\psi$, respectively, and thus

$$\varphi \circ \psi^{-1}\colon W' \to V'$$

is an isomorphism. If we look at the effect of this isomorphism on the
part of $W'$ corresponding to Y, we see that it simply induces by restriction
a map

$$W_1' \times \{b_2\} \to V_1' \times \{a_2\},$$

whence a map $W_1' \to V_1'$ which is of class $C^p$, and has a $C^p$ inverse,
induced by $\psi \circ \varphi^{-1}\colon V' \to W'$. This proves what we wanted, i.e. that the
family of all pairs $\{(Y \cap V, \varphi_1)\}$ is an atlas for Y.


The proof is based on the following obvious fact, which it is useful to 
keep in mind when dealing with submanifolds. 


Let $V_1$, $V_2$, $W_1$, $W_2$ be open subsets of euclidean spaces, and let

$$g\colon V_1 \times V_2 \to W_1 \times W_2$$

be a $C^p$ map. Let $a_2 \in V_2$, $b_2 \in W_2$, and assume that g maps $V_1 \times \{a_2\}$
into $W_1 \times \{b_2\}$. Then the induced map

$$g_1\colon V_1 \to W_1$$

is also a $C^p$ map.

Indeed, it is obtained as a composite map

$$V_1 \to V_1 \times V_2 \to W_1 \times W_2 \to W_1,$$

the first map being an injection of $V_1$ as a factor, and the third map a
projection on the first factor.

The following statement has a proof based on the same principle.


Let Y be a submanifold of X and let $f\colon Z \to X$ be a map from a
manifold Z into X such that f(Z) is contained in Y. Let $f_Y\colon Z \to Y$ be
the induced map. Then f is a morphism if and only if $f_Y$ is a morphism.


We leave the proof to the reader. 

We observe that if Y is a submanifold of X, then the inclusion map of 
Y into X is a morphism. If Y is also a closed subset of X, then we say 
that Y is a closed submanifold. 


Theorem 2.1. Let U be open in $\mathbf{R}^n$ and let $f\colon U \to \mathbf{R}$ be a $C^p$ function,
$p \ge 1$. Let

$$a = (a_1, \ldots, a_n) \in U,$$

and assume that $D_n f(a) \ne 0$. Then the map

$$\varphi\colon (x_1, \ldots, x_n) \mapsto (x_1, \ldots, x_{n-1}, f(x))$$

is a local $C^p$ isomorphism at a, i.e. is a chart at a. If f(a) = c, then
locally at a, the inverse image $f^{-1}(c)$ is a submanifold of $\mathbf{R}^n$.


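Proof. The Jacobian matrix of $\varphi$ at a is the matrix

$$\begin{pmatrix}
1 & 0 & \cdots & 0 & 0 \\
0 & 1 & \cdots & 0 & 0 \\
\vdots & \vdots & & \vdots & \vdots \\
0 & 0 & \cdots & 1 & 0 \\
D_1 f(a) & D_2 f(a) & \cdots & D_{n-1} f(a) & D_n f(a)
\end{pmatrix}$$

and its determinant is $D_n f(a) \ne 0$. The inverse mapping theorem shows
that $\varphi$ is a local $C^p$ isomorphism at a, thus proving Theorem 2.1.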
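The determinant computation in the proof can be checked numerically on a sample case. The sketch below takes $f(x) = x_1^2 + x_2^2 + x_3^2$ and the point a = (0, 0.6, 0.8), both chosen by us for illustration. It approximates the Jacobian of the map $(x_1, x_2, x_3) \mapsto (x_1, x_2, f(x))$ by central differences and compares its determinant with $D_3 f(a) = 1.6$. The helpers `jacobian` and `det` are ours and use only the standard library.

```python
def f(x):
    """Sample function f(x) = x_1^2 + ... + x_n^2."""
    return sum(t * t for t in x)

def phi(x):
    """The map (x_1, ..., x_n) -> (x_1, ..., x_{n-1}, f(x))."""
    return list(x[:-1]) + [f(x)]

def jacobian(F, a, h=1e-6):
    """Central-difference approximation; J[i][j] approximates dF_i/dx_j at a."""
    m = len(F(a))
    J = [[0.0] * len(a) for _ in range(m)]
    for j in range(len(a)):
        up = list(a); up[j] += h
        dn = list(a); dn[j] -= h
        Fu, Fd = F(up), F(dn)
        for i in range(m):
            J[i][j] = (Fu[i] - Fd[i]) / (2.0 * h)
    return J

def det(mat):
    """Determinant by Gaussian elimination with partial pivoting."""
    m = [row[:] for row in mat]
    n = len(m)
    d = 1.0
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(m[r][c]))
        if m[p][c] == 0.0:
            return 0.0
        if p != c:
            m[c], m[p] = m[p], m[c]
            d = -d
        d *= m[c][c]
        for r in range(c + 1, n):
            k = m[r][c] / m[c][c]
            for cc in range(c, n):
                m[r][cc] -= k * m[c][cc]
    return d

a = [0.0, 0.6, 0.8]
print(det(jacobian(phi, a)))  # should be close to D_3 f(a) = 2 * 0.8
```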


Corollary 2.2. Let Y be the subset of U consisting of all points x such
that f(x) = c. Then there exists an open neighborhood V of a such that
$Y \cap V$ is a submanifold of $\mathbf{R}^n$.


Proof. We take V such that $\varphi$ is a $C^p$ isomorphism on V. Those
points of Y correspond to the points such that

$$\varphi(x) = (x_1, \ldots, x_{n-1}, c).$$

If g is the inverse mapping of $\varphi$, then the map

$$(x_1, \ldots, x_{n-1}) \mapsto g(x_1, \ldots, x_{n-1}, c)$$

is the inverse parametrization of a chart, and $\varphi$ restricted to $V \cap Y$ maps
$V \cap Y$ in a factor of a product, as desired.


Example. The map $f(x) = x_1^2 + \cdots + x_n^2$ and c = 1 give the sphere in
the preceding corollary, so that the sphere is a submanifold of $\mathbf{R}^n$. For
any point of the sphere, some coordinate is not equal to 0, and the 
partial derivative of f at that point is not 0, so that the corollary applies. 


The argument proving Theorem 2.1 can easily be generalized to cover 
other cases in which we can prove that a certain subset is a submanifold. 
We shall formulate these criteria, which involve the derivative as linear 
map. We shall not use them later in this book, and the reader may omit 
them without harm. For further applications and terminology concerning 
this, cf. books on differentiable manifolds, e.g. [La 2]. 


Let U be open in E and let $f\colon U \to F$ be a $C^p$ morphism with $p \ge 1$.
Let $x_0 \in U$. Assume that $f'(x_0)$ is a linear isomorphism of E onto a
subspace $F_1$ of F. Let $F = F_1 \oplus F_2$. Then there exists a local $C^p$
isomorphism $g\colon F \to F_1 \times F_2$ at $f(x_0)$ and an open subset $U_1$ of U
containing $x_0$ such that the composite map $g \circ f$ induces a $C^p$ isomorphism
of $U_1$ onto an open subset of $F_1$.


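Proof. Consider the map

$$\varphi\colon U \times F_2 \to F_1 \times F_2$$

given by

$$\varphi(x, y_2) = (f(x), 0) + (0, y_2).$$

Then

$$\varphi'(x_0, 0) = \begin{pmatrix} f'(x_0) & 0 \\ 0 & I \end{pmatrix},$$

where we use the matrix representation of a linear map of $E \times F_2$ into
$F_1 \times F_2$. For $v_1 \in E$ and $v_2 \in F_2$ we have

$$\begin{pmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{pmatrix}
\begin{pmatrix} v_1 \\ v_2 \end{pmatrix}
= \begin{pmatrix} A_{11}v_1 + A_{12}v_2 \\ A_{21}v_1 + A_{22}v_2 \end{pmatrix}.$$

In this representation, we view $f'(x_0)$ as a linear map of E onto $F_1$. We
see that $\varphi'(x_0, 0)$ is a linear isomorphism between $E \times F_2$ and $F_1 \times F_2$.
By the inverse mapping theorem, we conclude that $\varphi$ is a local $C^p$
isomorphism at $(x_0, 0)$. We let $\psi$ be its local $C^p$ inverse. Then it is
obvious that $\psi$ induces a map g to satisfy our requirements.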


In the preceding result, we may view f as parametrizing a subset of F,
say locally at $x_0$ by $U_1 \to f(U_1)$. The lemma shows that there is a chart
at $f(x_0)$ in $f(U_1)$ which maps $f(U_1)$ into a factor in the product $F_1 \times F_2$.
Note that if we write $E = \mathbf{R}^q$ and $F = \mathbf{R}^n$, then the subspace $F_1$ of F is
not necessarily equal to $\mathbf{R}^q$ in its usual embedding in $\mathbf{R}^n$ as the space of
the first q coordinates. The subspace $F_1$ can be quite arbitrary. However,
we can find a complementary subspace $F_2$, and then a basis of $F_1$
and of $F_2$ in such a way that if we take coordinates with respect to this
basis, then the coordinates of $g(f(U_1))$ are precisely the coordinates
$(x_1, \ldots, x_q, 0, \ldots, 0)$. In our geometric terminology, we can say that $f(U_1)$
is a submanifold of F.

The next result deals with the dual situation, where instead of an 
injection we deal with a projection. If we have a map 


$$V_1 \times V_2 \to F,$$


then we shall say that this map is a projection (on the first factor) if this
map can be expressed as a composite

$$V_1 \times V_2 \to V_1 \to F$$

of the actual projection on the first factor, followed by a map of $V_1$
into F. 


Let U be open in E and let a be a point of U. Let $f\colon U \to F$ be a $C^p$
map, $p \ge 1$. Assume that the derivative $f'(a)\colon E \to F$ is surjective. Let
$E_2$ be a subspace of E such that $f'(a)$ induces a linear isomorphism of
$E_2$ with F, and let $E_1$ be a complementary subspace to $E_2$ in E, that is
$E = E_1 \times E_2$. Then the map

$$(x_1, x_2) \mapsto (x_1, f(x_1, x_2))$$

is a local $C^p$ isomorphism at a.


Proof. The derivative of the map at a is represented by the matrix

$$\begin{pmatrix} \mathrm{id}_{E_1} & 0 \\ D_1 f(a) & D_2 f(a) \end{pmatrix}$$

and is therefore invertible at $a = (a_1, a_2)$ because $D_2 f(a)$ by definition is a
linear isomorphism of $E_2$ with F. The inverse mapping theorem shows
that our map is locally $C^p$ invertible at a, as was to be shown.


In particular, let $c \in F$, and consider those points $x \in U$ such that
f(x) = c, i.e. such that f has constant value c. If $v_2 \in E_2$ is such that
$f(v_2) = c$ (and $v_2$ is close to $a_2$), then the inverse image $f^{-1}(c)$
corresponds to a factor $V_1 \times \{v_2\}$ in $E_1 \times E_2$ locally near a. One of the
most important examples is that of a function, which we treated in
Theorem 2.1.


XXII, §3. TANGENT SPACES 


Let X be a $C^p$ manifold ($p \ge 1$). Let x be a point of X. We then have a
representation of x in every chart at x, which maps an open neighborhood
of x into a euclidean space E. We consider triples $(U, \varphi, v)$ where
$(U, \varphi)$ is a chart at x, and v is a vector in the vector space in which $\varphi U$
lies. We say that two such triples $(U, \varphi, v)$ and $(V, \psi, w)$ are equivalent if
the derivative of $\psi \circ \varphi^{-1}$ at $\varphi x$ maps v on w. The formula reads

$$(\psi \circ \varphi^{-1})'(\varphi x)v = w.$$

This is obviously an equivalence relation by the chain rule. An equivalence
class of such triples is called a tangent vector of X at x. Thus we
represent a tangent vector much the same way that we represent a point
of X, by its representation relative to charts. The set of such tangent
vectors is called the tangent space of X at x and is denoted by $T_x(X)$.
Each chart $(U, \varphi)$ determines a bijection of $T_x(X)$ on a euclidean space,
namely the equivalence class of $(U, \varphi, v)$ corresponds to the vector v.
Suppose that X is a manifold. Then each derivative

$$(\psi \circ \varphi^{-1})'(\varphi x)\colon E \to E$$

is an invertible linear map. Let $v_1$, $v_2$ be vectors representing tangent
vectors $\bar v_1$, $\bar v_2$ in the chart $(U, \varphi)$, and let $w_1$, $w_2$ represent the same
tangent vectors in $(V, \psi)$. Then by definition

$$w_i = (\psi \circ \varphi^{-1})'(\varphi x)v_i.$$

From this we see that $v_1 + v_2$ and $w_1 + w_2$ represent the same tangent
vector, and that if $c \in \mathbf{R}$, then $cv_1$ and $cw_1$ represent the same tangent
vector. Thus we can define addition and multiplication by numbers in
$T_x(X)$ in such a way that

$$\bar v_1 + \bar v_2 = \overline{v_1 + v_2} \quad\text{and}\quad c\bar v_1 = \overline{cv_1}.$$

Then $T_x(X)$ is a vector space, and the map

$$v \mapsto \bar v$$

is a linear isomorphism of E onto $T_x(X)$.
The derivative of a map defined on open sets of euclidean spaces can
now be interpreted on manifolds. Let $f\colon X \to Y$ be a $C^p$ map and let
$x \in X$. We define the tangent map at x,

$$T_x f\colon T_x(X) \to T_{f(x)}(Y)$$

as the unique linear map having the following property: If $(U, \varphi)$ is a
chart at x and $(V, \psi)$ is a chart at f(x) such that $f(U) \subset V$, and $\bar v$ is a
tangent vector at x represented by v in the chart $(U, \varphi)$, then

$$T_x f(\bar v)$$

is the tangent vector at f(x) represented by $Df_{U,V}(\varphi x)v$. It is immediately
verified that there does exist such a unique linear map. The tangent
linear map is also occasionally denoted by $df_x$, and is also called the
differential of f at x. The representation of $T_x f$ on the spaces of charts
can be given in the form of a diagram.

$$\begin{array}{ccc}
T_x(X) & \longrightarrow & E \\
{\scriptstyle T_x f}\,\downarrow & & \downarrow\,{\scriptstyle f'_{U,V}(\varphi x)} \\
T_{f(x)}(Y) & \longrightarrow & F
\end{array}$$

Here of course, F is the space in which $\psi(V)$ lies.
If $f\colon X \to Y$ and $g\colon Y \to Z$ are two $C^p$ maps, then the chain rule can be
expressed by the formula

$$T_x(g \circ f) = (T_{f(x)}g) \circ (T_x f).$$
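In the special case where X, Y, Z are open subsets of euclidean spaces with their identity charts, the tangent maps are the ordinary derivatives, and the formula above is the usual chain rule for Jacobian matrices. The following sketch checks this numerically for two sample maps f and g chosen by us. The helper names and the finite-difference step are illustrative assumptions, not part of the text.

```python
import math

def f(x):
    """Sample map f: R^2 -> R^2."""
    return [x[0] * x[1], math.sin(x[0]) + x[1]]

def g(y):
    """Sample map g: R^2 -> R^1."""
    return [y[0] ** 2 + 3.0 * y[1]]

def jacobian(F, a, h=1e-6):
    """Central-difference approximation; J[i][j] approximates dF_i/dx_j at a."""
    m = len(F(a))
    J = [[0.0] * len(a) for _ in range(m)]
    for j in range(len(a)):
        up = list(a); up[j] += h
        dn = list(a); dn[j] -= h
        Fu, Fd = F(up), F(dn)
        for i in range(m):
            J[i][j] = (Fu[i] - Fd[i]) / (2.0 * h)
    return J

def matmul(A, B):
    """Product of two matrices given as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def chain_rule_gap(x):
    """Largest entry of D(g o f)(x) - Dg(f(x)) Df(x), in absolute value."""
    lhs = jacobian(lambda p: g(f(p)), x)
    rhs = matmul(jacobian(g, f(x)), jacobian(f, x))
    return max(abs(lhs[i][j] - rhs[i][j])
               for i in range(len(lhs)) for j in range(len(lhs[0])))

print(chain_rule_gap([0.5, -1.2]))  # small, up to finite-difference error
```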


In particular, suppose that Y is a submanifold of X, and let $x \in Y$ be a
point of Y. Then we have the inclusion map

$$j\colon Y \to X$$

which induces an injective linear map

$$T_x j\colon T_x(Y) \to T_x(X),$$

whose image is a subspace of $T_x(X)$. This is the situation which is
usually depicted by the following picture:

[Figure: the submanifold Y and its tangent space at x.]

Here X = E is the whole vector space. Suppose that Y is a submanifold
of E and let $x \in Y$. Let

$$\psi_1\colon V_1 \to Y$$

be a local isomorphism of some open $V_1$ in a space F with Y, at a point
$y_1 \in V_1$ such that $\psi_1(y_1) = x$. Let us view $\psi_1$ as a map of $V_1$ into E. Then

$$\psi_1'(y_1)\colon F \to E$$

is an injective linear map, whose image is a subspace $E_0$ of E. One can
verify directly, or from the abstract fact that $T_x Y$ is defined, that if

$$\psi_2\colon V_2 \to Y$$

is a local isomorphism of some open $V_2$ in F with Y, at a point $y_2 \in V_2$
such that $\psi_2(y_2) = x$ also, then the image of $\psi_2'(y_2)$ is in fact equal to $E_0$.
This subspace $E_0$ is the translation of the “tangent space” drawn on the
picture. In fact, the tangent space drawn on the picture consists of all
pairs (x, v) with $v \in E_0$. We view each such pair (x, v) as a located vector,
starting at x and ending at x + v.

The collection of tangent spaces, namely the union of all $T_x(X)$ for all
$x \in X$, will be called the tangent bundle of X, and will be denoted by
T(X). We can in fact make T(X) into a $C^{p-1}$ manifold by giving natural
charts for it as follows.

We have a natural map

$$\pi\colon T(X) \to X$$

which maps each tangent space $T_x(X)$ on the point x of X. We call $\pi$ the
natural projection. Let $(U, \varphi)$ be a chart of X, with $\varphi U$ open in E. We
then obtain a map

$$\tau_U\colon \pi^{-1}(U) \to \varphi U \times E$$

defined by

$$\tau_U(\bar v) = (\varphi x, v)$$

if $\pi(\bar v) = x$ and $\bar v$ is a tangent vector at x, represented by v in E, with
respect to the chart. In fact, it is clear that $\tau_U$ is a bijection.
Let $(U, \varphi)$ and $(V, \psi)$ be two charts. We have

$$\pi^{-1}(U) \cap \pi^{-1}(V) = \pi^{-1}(U \cap V).$$

We obtain a transition mapping

$$\tau_V \circ \tau_U^{-1}\colon \varphi(U \cap V) \times E \to \psi(U \cap V) \times E,$$

by

$$(\varphi x, v) \mapsto (\psi x,\ D(\psi \circ \varphi^{-1})(\varphi x)v)$$

for $x \in U \cap V$ and $v \in E$. Since the derivative $D(\psi \circ \varphi^{-1})$ is of class $C^{p-1}$
and is a linear isomorphism at $\varphi x$, we conclude that our family of maps
$\{\tau_U\}$, for $(U, \varphi)$ ranging over all charts of X, is an atlas for T(X), and
therefore that T(X) is a $C^{p-1}$ manifold, as we predicted it would be.

We call each chart $(\pi^{-1}U, \tau_U)$ a trivializing chart of T(X), over the open
set U. Locally, we see that each such trivializing chart for T(X) gives an
isomorphism of the tangent bundle over U with a product $\varphi U \times E$.

Let $f\colon X \to Y$ be a $C^p$ morphism, $p \ge 1$. We can then define a tangent
map

$$Tf\colon T(X) \to T(Y)$$

to be simply the map equal to

$$T_x f\colon T_x(X) \to T_{f(x)}(Y)$$

on the tangent space at x. It is immediately clear from the way in which
we defined the charts for the tangent bundle that Tf is a $C^{p-1}$ morphism.
Over an open set U of X with chart $(U, \varphi)$, suppose that f maps U into
an open set V of Y, with chart $(V, \psi)$. We can represent Tf as the
derivative as on the following diagram:

$$\begin{array}{ccc}
\pi^{-1}(U) & \xrightarrow{\ \tau_U\ } & \varphi U \times E \\
{\scriptstyle Tf}\,\downarrow & & \downarrow \\
\pi^{-1}(V) & \xrightarrow{\ \tau_V\ } & \psi V \times F.
\end{array}$$

The map on the right can be viewed as the pair $(f_{U,V}, f'_{U,V})$.

Let X be a $C^p$ manifold, $p \ge 0$. By a $C^p$ function on X we shall always
mean a morphism of X into R of class $C^p$ (unless otherwise specified,
when we take complex valued functions). The functions form a ring
$C^p(X)$. As usual, the support of a function is the closure of the set of
points x such that $f(x) \ne 0$.

Let X be a topological space. A covering of X is called locally finite if 
every point of X has a neighborhood which intersects only finitely many 
elements of the covering. A refinement $\{V_j\}$ of a covering $\{U_i\}$ of X is a
covering such that each $V_j$ is contained in some $U_i$. We also say that the
covering $\{V_j\}$ is subordinated to the covering $\{U_i\}$.

A partition of unity (of class $C^p$) on a manifold X consists of an open
covering $\{U_i\}$ of X and a family of $C^p$ functions

$$\psi_i\colon X \to \mathbf{R}$$

satisfying the following conditions:


PU 1. For all $x \in X$, we have $\psi_i(x) \ge 0$.

PU 2. The support of $\psi_i$ is contained in $U_i$.

PU 3. The covering is locally finite.

PU 4. For each point $x \in X$ we have $\sum_i \psi_i(x) = 1$.


(The sum is taken over all i, but is in fact finite for any given point x in
view of PU 3.)

As a matter of notation, we often write that $\{(U_i, \psi_i)\}$ is a partition of
unity if it satisfies the previous four conditions.


Theorem 4.1. Let X be a manifold which is Hausdorff and whose
topology has a countable base. Given an open covering $\mathcal{U}$ of X, there
exists an atlas $\{(V_k, \varphi_k)\}$ such that the covering $\{V_k\}$ is locally finite and
subordinated to the given covering $\mathcal{U}$, such that $\varphi_k V_k$ is the open ball
$B_3(0)$ of radius 3 centered at 0, and such that the open sets $W_k =
\varphi_k^{-1}(B_1)$ cover X (where $B_1$ is the open ball of radius 1 centered at 0).


Proof. Let $U_1, U_2, \ldots$ be a basis for the open sets of X such that each
$\bar U_i$ is compact. We can find such a basis since X is locally compact. We
construct inductively a sequence $A_1, A_2, \ldots$ of compact sets whose union
is X, such that $A_i$ is contained in the interior of $A_{i+1}$. We start with
$A_1 = \bar U_1$. Suppose that we have constructed $A_1, \ldots, A_i$. Let j be the
smallest integer such that $A_i$ is contained in $U_1 \cup \cdots \cup U_j$. We let $A_{i+1}$ be
the closed and compact set

$$\bar U_1 \cup \cdots \cup \bar U_j \cup \bar U_{i+1}.$$

This gives our desired sequence of compact sets.

For each point $x \in X$ we can find an arbitrarily small chart $(V_x, \varphi_x)$ at
x such that $\varphi_x V_x$ is the open ball of radius 3 centered at 0. We can
therefore assume that $V_x$ is contained in some open set of the covering $\mathcal{U}$.
As for the statement concerning the ball of radius 3, we can always
shrink our open set $V_x$ so that its image is a ball, and then adjust the
image by a translation and multiplication by a positive number to make
the image exactly equal to $B_3(0)$. For each i and each x in the open set

$$\mathrm{Int}(A_{i+2}) - A_{i-1}$$

we select $V_x$ to be contained in this open set. We let $W_x = \varphi_x^{-1}(B_1)$ be
the inverse image of the ball of radius 1. We can cover the compact set
(annulus)

$$A_{i+1} - \mathrm{Int}(A_i)$$

by a finite number of sets $W_{x_1}, \ldots, W_{x_n}$. Let $\mathcal{B}_i$ denote the family
$\{V_{x_1}, \ldots, V_{x_n}\}$, and let $\mathcal{B}$ be the union of all $\mathcal{B}_i$ for all $i = 1, 2, \ldots$. Then
$\mathcal{B}$ is an open covering of X, is locally finite, and is subordinated to our
given covering $\mathcal{U}$. It also satisfies the other requirements of the theorem.


Corollary 4.2. Let X be a $C^p$ manifold which is Hausdorff, and whose
topology has a countable base. Then X has $C^p$ partitions of unity
subordinated to a given covering $\mathcal{U}$.


Proof. Let $\{(V_k, \varphi_k)\}$ be as in the theorem, and let $W_k = \varphi_k^{-1}(B_1)$ be
as in the theorem. We can find a function $\psi_k$ of class $C^p$ such that
$0 \le \psi_k \le 1$, such that $\psi_k(x) = 1$ for $x \in W_k$, and $\psi_k(x) = 0$ for $x \notin V_k$. (The
proof is recalled below.) We now let

$$\psi = \sum_k \psi_k.$$

The sum is finite at each point, and we let $\gamma_k = \psi_k/\psi$. Then $\{(V_k, \gamma_k)\}$ is
the desired partition of unity.


We now recall the argument giving the function $\psi_k$. If a < b, then
the function defined by

$$\exp\left(\frac{-1}{(t-a)(b-t)}\right)$$

in the open interval a < t < b and 0 outside the interval determines a
bell-shaped $C^\infty$ function from R to R. Its integral from $-\infty$ to t divided
by the area under the bell yields a function which lies strictly between 0
and 1 in the interval a < t < b, is equal to 0 for $t \le a$ and is equal to 1
for $t \ge b$.

We can therefore find a real valued function of a real variable, say
$\eta(t)$, such that $\eta(t) = 1$ for $|t| < 1$ and $\eta(t) = 0$ for $|t| \ge 1 + \delta$ with small
$\delta$, and such that $0 \le \eta \le 1$. Then $\eta(|x|^2) = \psi(x)$ gives us a function
which is equal to 1 on the ball of radius 1 and 0 outside the ball of
radius $1 + \delta$. (We denote by $|\ |$ the euclidean norm.) This function can
then be transported to the manifold by any given chart whose image is
the ball of radius 3.
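The construction just recalled can be sketched numerically. In the code below the integral of the bell is approximated by a midpoint rule, and the value $\delta = 0.5$ is an arbitrary choice of ours. The function names (`bell`, `step`, `eta`, `psi`) are illustrative and follow the notation of the text only loosely.

```python
import math

def bell(t, a, b):
    """exp(-1 / ((t - a)(b - t))) on the open interval (a, b), and 0 outside."""
    if t <= a or t >= b:
        return 0.0
    return math.exp(-1.0 / ((t - a) * (b - t)))

def integral(a, b, lo, hi, steps=2000):
    """Midpoint-rule approximation of the integral of the bell over [lo, hi]."""
    if hi <= lo:
        return 0.0
    h = (hi - lo) / steps
    return h * sum(bell(lo + (k + 0.5) * h, a, b) for k in range(steps))

def step(t, a, b):
    """Integral of the bell up to t, divided by the total area under the bell."""
    if t <= a:
        return 0.0
    if t >= b:
        return 1.0
    return integral(a, b, a, t) / integral(a, b, a, b)

def eta(t, delta=0.5):
    """Equal to 1 for |t| <= 1, equal to 0 for |t| >= 1 + delta, between 0 and 1 otherwise."""
    return 1.0 - step(abs(t), 1.0, 1.0 + delta)

def psi(x, delta=0.5):
    """psi(x) = eta(|x|^2) for a point x given as a tuple of coordinates."""
    return eta(sum(c * c for c in x), delta)

print(psi((0.3, 0.4)), psi((1.1, 0.0)), psi((2.0, 0.0)))
```

Composing with $|x|^2$ rather than $|x|$ avoids taking a square root, and since $\eta$ is constant near 0 either choice gives a smooth function.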


Corollary 4.3. Let A, B be disjoint closed subsets of $\mathbf{R}^n$, or of a
manifold X admitting $C^p$ partitions of unity subordinate to any given
open covering. Then there exists a $C^p$ function f such that

$$0 \le f \le 1, \qquad f = 1 \text{ on } A, \qquad f = 0 \text{ on } B.$$


Proof. For each $x \in A$ let $U_x$ be an open neighborhood of x not
intersecting B. Let $\{\alpha_j\}$ be a partition of unity subordinate to the covering
consisting of the complement $\mathcal{C}A$ of A and $\{U_x\}_{x \in A}$. Let J be the set of
those indices j such that $\mathrm{supp}\,\alpha_j \subset U_{x(j)}$ for some $x(j) \in A$. Let

$$f = \sum_{j \in J} \alpha_j.$$

For any $x \in X$ there is only a finite number of functions $\alpha_j$ such that
$\alpha_j(x) \ne 0$, so our sum expressing f is actually a finite sum. If $x \in A$ and
$\alpha_j$ has support in $\mathcal{C}A$, then $\alpha_j(x) = 0$. Hence for each $x \in A$ we have
f(x) = 1. If $x \in B$, then $\alpha_j(x) = 0$ for each $j \in J$ so f(x) = 0. For any
$x \in X$ we have $0 \le f(x) \le 1$ because of the definition of a partition of
unity and the fact that we take our sum for f only over a subset of the
indices {j}. This proves our corollary.


Remark. In some cases, one wants a function f as in the corollary 
with certain bounds on its derivative. One can achieve such bounds by 
being more careful in selecting the $\alpha_j$. For an example of the kind of
technique used, cf. the end of Chapter XXIII, §6. 


XXII, §5. MANIFOLDS WITH BOUNDARY 


In our applications, we need manifolds with boundary. Let $\lambda\colon E \to \mathbf{R}$
be a functional on E. For instance, if $E = \mathbf{R}^n$ we may consider $\lambda = \lambda_n$
to be the projection on the n-th coordinate. We denote by $E_\lambda^0$ the
kernel of $\lambda$, and by $E_\lambda^+$ (resp. $E_\lambda^-$) the set of points $x \in E$ such that
$\lambda x \ge 0$ (resp. $\lambda x \le 0$). We call $E_\lambda^0$ the hyperplane determined by $\lambda$, and
we call $E_\lambda^+$ or $E_\lambda^-$ a half space. The terminology is justified by the
natural pictures.


If $\mu$ is another functional, and $E_\lambda^+ = E_\mu^+$, then there exists a number
c > 0 such that $\lambda = c\mu$.


This is easily proved. Indeed, we see at once that the kernels of $\lambda$ and
$\mu$ must be equal. Suppose that $\lambda \ne 0$. Let $x_0$ be such that $\lambda(x_0) > 0$.
Then $\mu(x_0) > 0$ also. The functional

$$\lambda - c\mu$$

where $c = \lambda(x_0)/\mu(x_0)$ vanishes on the kernel of $\lambda$ (or $\mu$), and also on $x_0$.
Therefore it is the 0 functional and c satisfies our requirement.


In practice, we shall use mostly coordinate functions as functionals, 
and especially the n-th coordinate function. However, it is reasonable to 
use a slightly more invariant language which exhibits more directly the
geometric nature of the forthcoming constructions. We shall be interested
in figures like the shell of a cylinder:


If we exclude the two end circles, then what is left is just an ordinary 
manifold. However, if we include the two end circles, then we have an 
object which, at each point of one of the end circles, does not look like 
some open set in 2-space, but rather looks like a point at the boundary 
of a half plane. In other words, we have a parametrization as indicated 
by the following picture: 


We shall formulate the definitions and lemmas which allow us to give a 
formal development for such parametrizations. 


Let E, F be euclidean spaces, and let $E_\lambda^+$ and $F_\mu^+$ be two half spaces in
E and F, respectively. Let U, V be open subsets of these half spaces
respectively. We shall say that a mapping

$$f\colon U \to V$$

is of class $C^p$ if the following condition is satisfied. Given $x \in U$, there
exists an open neighborhood $U_1$ of x in E, and an open neighborhood $V_1$
of f(x) in F, and a $C^p$ map $f_1\colon U_1 \to V_1$ such that the restriction of $f_1$ to
$U_1 \cap U$ is equal to f. As usual, we take $p \ge 1$.

We can now define a manifold with boundary in a manner entirely 
similar to the one used to define manifolds, namely by conditions AT 1, 
AT 2, AT 3, except that we take the images $\varphi_i U_i$ of the charts of an atlas
to be open subsets of half spaces. The notion of $C^p$-isomorphism is defined
as usual by the condition of having a $C^p$-inverse.

We must make some remarks concerning the boundary, and we need
some lemmas, e.g. to show that the boundary is a “differentiable
invariant”.

Lemma 5.1. Let U be open in E, and let $f\colon U \to F$ and $g\colon U \to F$ be
two $C^p$ maps ($p \ge 1$). Assume that f and g have the same restriction to
$U \cap E_\lambda^+$ for some half space $E_\lambda^+$, and let $x \in U \cap E_\lambda^+$. Then $f'(x) = g'(x)$.


Proof. After considering the difference of f and g, we may assume
without loss of generality that the restriction of f to $U \cap E_\lambda^+$ is 0. It then
follows that $f'(x) = 0$ because the directions of the half space span the
whole space.


Lemma 5.2. Let U be open in E. Let $\mu$ be a non-zero functional on F,
and let $f\colon U \to F_\mu^+$ be a $C^p$ map with $p \ge 1$. Let x be a point of U such
that f(x) lies in $F_\mu^0$. Then $f'(x)$ maps E into $F_\mu^0$.


Proof. Without loss of generality, after translations, we may assume
that x = 0 and f(x) = 0. Let W be a given bounded neighborhood of 0
in F. Suppose that we can find an element $v \in E$ such that $\mu f'(0)v \ne 0$.
We can write (for small t > 0):

$$f(tv) = tf'(0)v + o(t)w_t,$$

with some element $w_t \in W$. By assumption, f(tv) lies in $F_\mu^+$. Applying $\mu$,
we get

$$t\mu f'(0)v + o(t)\mu(w_t) \ge 0.$$

Dividing by t, this yields

$$\mu f'(0)v \ge -\frac{o(t)}{t}\mu(w_t).$$

Replacing t by $-t$ we get a similar inequality on the other side. Letting
t tend to 0 shows that $\mu f'(0)v = 0$, a contradiction.


Let U be open in some half plane $E_\lambda^+$. We define the boundary of U
(written $\partial U$) to be the intersection of U with $E_\lambda^0$. We define the interior
of U, written Int(U), to be the complement of $\partial U$ in U. Then Int(U) is
open in E.


Example. Let $E_\lambda^+$ be a half space, with $\lambda \ne 0$. Then from our definition,
we see that this half space is $C^p$ isomorphic to a product

$$E_\lambda^+ \approx E_\lambda^0 \times \mathbf{R}^+,$$

where $\mathbf{R}^+$ is the set of real numbers $\ge 0$. The boundary in this case is
$E_\lambda^0 \times \{0\}$.




Lemma 5.3. Let $\lambda$ be a functional on E and $\mu$ a functional on F. Let
U be open in $E_\lambda^+$ and V open in $F_\mu^+$. Assume that $U \cap E_\lambda^0$ and $V \cap F_\mu^0$
are not empty. Let $f\colon U \to V$ be a $C^p$-isomorphism ($p \ge 1$). Then $\lambda \ne 0$
if and only if $\mu \ne 0$. If $\lambda \ne 0$, then f induces a $C^p$-isomorphism of
Int(U) on Int(V) and of $\partial U$ on $\partial V$.


Proof. For each $x \in U$, we conclude from the chain rule that $f'(x)$ is
invertible. Our first assertion then follows from Lemma 5.2. We also see
that no interior point of U maps on a boundary point of V and conversely.
Thus f induces a bijection of $\partial U$ and $\partial V$, and a bijection of
Int(U) on Int(V). Since these interiors are open in their respective spaces,
it follows that f induces an isomorphism between them. As for the
boundary, it is a submanifold of the full space, and locally, our definition
of the derivative, together with the product structure, shows that the
restriction of f to $\partial U$ must be an isomorphism on $\partial V$.


We see that Lemma 5.3 gives us the invariance of the boundary under
$C^p$ isomorphisms ($p \ge 1$), first for open subsets of half spaces, but then also
immediately for the boundary of a manifold since the property reduces at 
once to such subsets, under charts. 

We can then describe local coordinates at a point in a manifold with
boundary as follows. If the point is not a boundary point, then a neighborhood
of this point is described by coordinates $(x_1, \ldots, x_n)$ in some
open set of $\mathbf{R}^n$, which we may even take to contain 0, and such that 0
corresponds to the given point.

If the point is a boundary point, then an open neighborhood can be
described by coordinates $(x_1, \ldots, x_n)$ satisfying

\[ x_1 \geq a, \quad\text{and}\quad x_2, \ldots, x_n \text{ lying in some open set of } \mathbf{R}^{n-1}. \]

After a translation, we can even achieve an inequality $x_1 \geq 0$ instead of
$x_1 \geq a$. The points with coordinates $x_1 = 0$ are precisely those on the
boundary. This comes from the fact that after a suitable choice of basis
of $\mathbf{R}^n$, we can always achieve the result that a functional is simply the
projection on the first coordinate of a suitable basis.

Similarly, we can define an embedded k-dimensional submanifold with
boundary in $\mathbf{R}^n$, in terms of coordinates. Namely, we say that a subset $X$
of $\mathbf{R}^n$ is such a submanifold if for each $x \in X$ there exists an open set $U$
in $\mathbf{R}^n$ containing $x$, an open set $V$ in $\mathbf{R}^n$ and a $C^p$ isomorphism $\varphi: U \to V$
such that

\[ \varphi(U \cap X) = V \cap (H^k \times \{c\}), \]

where $H^k$ is, say, the half plane in $\mathbf{R}^k$ defined by $x_1 \geq a$, and $c$ is a point
$(c_{k+1}, \ldots, c_n)$ in $\mathbf{R}^{n-k}$.


Example. Consider the cylinder, conveniently placed vertically as follows:

[Figure: a cylinder standing vertically along the z-axis.]

We can define a chart for part of the cylinder in terms of the three
coordinates $(r, \theta, z)$ satisfying the inequalities:

\[ 0 < \theta < \pi. \]

The map $\psi$ such that

\[ \psi(r, \theta, z) = (r \cos\theta, r \sin\theta, z) \]

is the inverse of the map $\varphi$ in the previous definition.


The Tangent Spaces. These can be defined much as for a manifold 
without boundary, since the charts and the spaces in which they lie 
can be used as before. For the equivalence classes between vectors, we 
needed the derivative of maps defining the changes of charts, but these 
are well defined independently of the manner in which one extends such 
changes of charts from half spaces to a full open set in the vector space. 


Partitions of Unity. The theorem proved in §4 goes over without 
essential change to manifolds with boundary. Of course, the open balls in 
Theorem 4.1 have to be replaced with their intersections with half spaces. 


XXII, §6. VECTOR FIELDS AND GLOBAL 
DIFFERENTIAL EQUATIONS 


Let $X$ be a manifold without boundary, of class $C^p$ with $p \geq 2$. We
assume that $X$ is Hausdorff. Let $\pi: T(X) \to X$ be the natural map of its
tangent bundle onto $X$. We know that $T(X)$ is a manifold of class $C^{p-1}$,
and that $\pi$ is of class $C^{p-1}$.

By a vector field on $X$ we mean a morphism (of class $C^{p-1}$)

\[ \xi: X \to T(X) \]

such that $\xi(x)$ lies in the tangent space $T_x(X)$ for each $x \in X$, or in other
words, such that $\pi \circ \xi = \mathrm{id}_X$. Thus a vector field assigns a tangent vector
to each point.

When we identify the tangent bundle of an open set $U$ in $E$ with the
product $U \times E$ relative to a chart $(U, \varphi)$, then we see that a vector field
corresponds to a map

\[ U \to U \times E \]

such that

\[ \xi(x) = (x, f(x)), \]

where $f: U \to E$ is a $C^{p-1}$ map. Thus a vector field is completely determined
by the map $f$, which has been studied in Chapter XIV. We call $f$
the local representation of the vector field $\xi$ in the chart $(U, \varphi)$.

Let J be an open interval of R. The tangent bundle of J is then 
naturally identifiable with J x R, since the identity map of J is a global 
chart for J. In particular, we can view the number 1 as a tangent vector 
at each point, and we have a constant vector field over J which takes 
this value 1 at all points. 

Let $\alpha: J \to X$ be a curve, i.e. a map from an open interval $J$ into $X$.
Assume that $\alpha$ is of class $C^1$. We want to take the derivative of $\alpha$.
Locally at each point of $J$, we can shrink the domain of definition of $\alpha$
to a subinterval $J_0$ such that $\alpha(J_0)$ is contained in the domain of definition
$U$ of a chart $(U, \varphi)$. Then the composite $\varphi \circ \alpha$ is a curve into $E$,

\[ J_0 \xrightarrow{\ \alpha\ } U \xrightarrow{\ \varphi\ } \varphi U \subset E. \]

We can then take the derivative $(\varphi \circ \alpha)'(t)$ for $t \in J_0$ as a vector in $E$.
This vector represents a tangent vector in $T_{\alpha(t)}(X)$, and it is immediately
clear that if we change the chart to another $(V, \psi)$, then $(\psi \circ \alpha)'(t)$ represents
the same tangent vector. In this way we obtain a curve which we
shall denote by $\alpha'$, into the tangent bundle, namely

\[ \alpha': J \to T(X), \]

which is such that $\alpha'(t)$ lies in $T_{\alpha(t)}(X)$. We shall also write $d\alpha/dt$ instead
of $\alpha'(t)$, following standard notation, consistent with previous notation
when we studied vector fields on open sets of vector spaces.

We say that $\alpha$ is an integral curve for the vector field $\xi$ if we have

\[ \alpha'(t) = \xi(\alpha(t)) \]

for all $t \in J$. If $J$ contains 0 and $\alpha(0) = x_0$, we say that $x_0$ is the initial
condition of $\alpha$. The theorems on differential equations proved in Chapter
XIV, §3, §4, §5 can now be formulated on manifolds.


Let $\alpha_1: J_1 \to X$ and $\alpha_2: J_2 \to X$ be two integral curves of the vector field
$\xi$ on $X$, with the same initial condition $x_0$. Then $\alpha_1$ and $\alpha_2$ are equal on
$J_1 \cap J_2$.


Proof. The proof is identical with that of Theorem 3.3 of Chapter 
XIV. 


The preceding result allows us to define an integral curve with given
initial condition $x$ on a maximal interval $J(x)$. Of course, the local
existence theorem proved in Chapter XIV shows that such an integral
curve exists. As before, we let $D(\xi)$ be the subset of $\mathbf{R} \times X$ consisting of
all points $(t, x)$ such that $t \in J(x)$. We define a global flow for $\xi$ to be the
map

\[ \alpha: D(\xi) \to X \]

such that for each $x \in X$ the map $\alpha_x: J(x) \to X$ given by

\[ \alpha_x(t) = \alpha(t, x) \]

is an integral curve for $\xi$ with initial condition $x$. When we select a chart
at a point $x$ of $X$, then we see that this definition of flow coincides with
the definition we gave for open sets in euclidean spaces for the local
representation of our vector field. As in Chapter XIV, we abbreviate

\[ \alpha(t, x) \quad\text{by}\quad tx. \]


Theorem 6.1. Let $\xi$ be a vector field on $X$ and $\alpha$ its flow. Let $x \in X$.
If $t_0$ lies in $J(x)$, then

\[ J(t_0 x) = J(x) - t_0 \]

and we have for all $t$ in $J(x) - t_0$:

\[ t(t_0 x) = (t + t_0)x. \]

Proof. Just like the proof of Theorem 5.1 of Chapter XIV.


Theorem 6.2. Let $\xi$ be a vector field of class $C^{p-1}$ on the $C^p$ manifold
$X$ ($2 \leq p \leq \infty$). Then the domain $D(\xi)$ is open in $\mathbf{R} \times X$ and the flow $\alpha$
for $\xi$ is a $C^{p-1}$ morphism.


Proof. Identical with the proof of Theorem 5.2, Chapter XIV.
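
To make the statement of Theorem 6.1 concrete, here is a small numerical sketch. The vector field $\xi(x) = x^2$ on $X = \mathbf{R}$ is our own choice of example, not one taken from the text; its flow is $\alpha(t, x) = x/(1 - tx)$, defined for those $t$ with $1 - tx > 0$. The function names `flow` and `interval` are hypothetical helpers introduced only for this illustration.

```python
import math

def flow(t, x):
    """Flow alpha(t, x) = x / (1 - t*x) of the vector field xi(x) = x**2 on R."""
    d = 1.0 - t * x
    if d <= 0.0:
        raise ValueError("t lies outside the maximal interval J(x)")
    return x / d

def interval(x):
    """Maximal interval J(x) = {t : 1 - t*x > 0}, returned as (lower, upper)."""
    if x > 0:
        return (-math.inf, 1.0 / x)
    if x < 0:
        return (1.0 / x, math.inf)
    return (-math.inf, math.inf)

x, t0, t = 0.5, 1.0, 0.25
y = flow(t0, x)                      # the point written t0 x in the text
lhs = flow(t, y)                     # t (t0 x)
rhs = flow(t + t0, x)                # (t + t0) x
shifted = tuple(b - t0 for b in interval(x))   # J(x) - t0

print(lhs, rhs)
print(interval(y), shifted)
```

For this particular field the solution blows up in finite time, so $J(x)$ is a proper subinterval of $\mathbf{R}$ when $x \neq 0$; this is why the domain $D(\xi)$ has to be tracked at all.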




Corollary 6.3. For each $t \in \mathbf{R}$, the set of $x \in X$ such that $(t, x)$ is
contained in the domain $D(\xi)$ is open in $X$.


Corollary 6.4. Let $D_t(\xi)$ be the set of points $x$ of $X$ such that $(t, x)$ lies
in $D(\xi)$. Then $D_t(\xi)$ is open for each $t \in \mathbf{R}$, and $\alpha_t$ is a $C^p$-isomorphism
of $D_t(\xi)$ onto an open subset of $X$. In fact, $\alpha_t(D_t) = D_{-t}$ and $\alpha_t^{-1} = \alpha_{-t}$.


Proof. Immediate from Theorems 4.1 and 6.1.


Corollary 6.5. If $x_0$ is a point of $X$ and $t$ is in $J(x_0)$, then there exists
an open neighborhood $U$ of $x_0$ such that $t$ lies in $J(x)$ for all $x \in U$, and
the map

\[ x \mapsto tx \]

is an isomorphism of $U$ onto an open neighborhood of $t x_0$.


In the present section, we have given the terminology which allows us 
to discuss differential equations on manifolds. 


CHAPTER XXIII


Integration and Measures
on Manifolds


Throughout this chapter, unless otherwise specified, we use the word
manifold to denote manifolds possibly having boundaries. From §3 to the
end, we let $X$ be a manifold of class $C^p$ with $p \geq 1$, which is Hausdorff
and has a countable base. These last two assumptions are to ensure
that $X$ admits $C^p$ partitions of unity, subordinated to any given open
covering.

Let $X$ be a $C^p$ manifold, with $p$ always $\geq 1$. To each tangent space
$T_x(X) = T_x$ we can associate the dual space $T_x^*$, and the alternating product
$\bigwedge^r T_x^*$. We form the union

\[ \bigcup_{x \in X} \bigwedge\nolimits^r T_x^* \quad\text{denoted by}\quad \bigwedge\nolimits^r T^*(X). \]

By a differential form on $X$ (of degree $r$) we shall mean a map

\[ \omega: X \to \bigwedge\nolimits^r T^*(X) \]

such that for each $x$ the value $\omega(x)$ lies in $\bigwedge^r T_x^*$. (We shall add differentiability
conditions in a moment.) The set of differential forms is a
vector space denoted by $\Omega^r(X)$.

If $f: X \to Y$ is a $C^p$ map of manifolds, then we obtain an induced map

\[ f^*: \Omega^r(Y) \to \Omega^r(X), \]

just as in the case of subsets of euclidean spaces, and arising from the
induced linear map at each point,

\[ T_x f: T_x \to T_{f(x)}. \]

(Cf. Theorem C of the Appendix, Chapter XXI, and Theorem 4.1 of
Chapter XXII.)

Essentially as we did with tangent vectors, we can find local representations
of differential forms in the corresponding euclidean space $\mathbf{R}^n = E$
of the manifold $X$. Indeed, let $x \in X$. Let

\[ \varphi: U \to \varphi U \subset \mathbf{R}^n \]

be a chart at $x$. We have an isomorphism

\[ \theta_x: \mathbf{R}^n \to T_x, \]

which to each vector $v \in \mathbf{R}^n$ associates the class of $(U, \varphi, v)$. If $\lambda$ is a
functional on $T_x$, then $\lambda \circ \theta_x$ is a functional on $\mathbf{R}^n$. Let $\omega$ be a 1-form on
$U$, and $\omega_x$ the value of $\omega$ at $x$. Then $\omega_x$ can be pulled back to $\mathbf{R}^n$, to
obtain the form

\[ \theta_x^*(\omega_x) = \omega_x \circ \theta_x \]

on $\mathbf{R}^n$. If $\omega$ is an $r$-form, and $x_1, \ldots, x_n$ are the coordinates of $x$ in $\mathbf{R}^n$,
then there exist functions $g_{(i)}$ on $\varphi U$ such that

\[ \theta_x^*(\omega_x) = \sum_{(i)} g_{(i)}(x_1, \ldots, x_n) \, dx_{i_1} \wedge \cdots \wedge dx_{i_r}, \]

and we say that the expression on the right is the local expression of $\omega$
determined by the chart, or corresponding to the chart $\varphi$. We shall also
commit the abuse of notation, writing $g_{(i)}(x)$ instead of $g_{(i)}(x_1, \ldots, x_n)$.
We abbreviate

\[ \omega_x^\varphi = \theta_x^*(\omega_x) \]

for simplicity.

If $(V, \psi)$ is another chart such that $U \cap V$ is not empty (so that our
two charts $(U, \varphi)$ and $(V, \psi)$ may be viewed as charts at a common
point), then we obtain a representation of $\omega$ determined by $\psi$. If $\omega$ is a
1-form, then $\omega_x$ is a functional on $T_x$, and the pull backs of $\omega_x$ to $\mathbf{R}^n$
determined by $\varphi$ and $\psi$ respectively can be visualized in the following diagram:

[Diagram: two copies of $\mathbf{R}^n$, one for each chart, each mapping to $T_x$ and then by $\omega_x$ to $\mathbf{R}$; the vertical map on the left joins the two copies of $\mathbf{R}^n$ and is $(\varphi \circ \psi^{-1})'(\psi x)$.]

The vertical map on the left is simply the derivative $(\varphi \circ \psi^{-1})'(\psi x)$, i.e.
the derivative at $\psi x$ of the transition map $\varphi \circ \psi^{-1}$ giving the change of
charts. In terms of local coordinates, the change in the local representation
of $\omega_x$ is given in terms of partial derivatives, which are of class $C^{p-1}$.
Similarly, for any $r$-form the change in the local representation is given
by certain subdeterminants of the Jacobian matrix of $(\varphi \circ \psi^{-1})'(\psi x)$, and
is again of class $C^{p-1}$. The most important case is that of an $n$-form, and
we can then write $\omega$ locally as

\[ \omega^\varphi = g(x) \, dx_1 \wedge \cdots \wedge dx_n. \]

If $(V, \psi)$ is the other chart, then we have explicitly

\[ \omega^\psi = g(f(y)) \Delta_f(y) \, dy_1 \wedge \cdots \wedge dy_n, \]

if $x = f(y)$ and $f = \varphi \circ \psi^{-1}$. As usual, $\Delta_f$ is the Jacobian determinant.
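
As a worked illustration of the change of chart formula for an $n$-form (the example is ours, chosen for familiarity, and is not taken from the text), take $n = 2$, let $\omega^\varphi = dx_1 \wedge dx_2$, so that $g = 1$, and suppose the transition map is given by polar coordinates $y = (r, \theta)$ with $r > 0$:

```latex
\[
  f(r, \theta) = (r\cos\theta,\; r\sin\theta),
  \qquad
  \Delta_f(r, \theta)
  = \det\begin{pmatrix}
      \cos\theta & -r\sin\theta \\
      \sin\theta & \phantom{-}r\cos\theta
    \end{pmatrix}
  = r\cos^2\theta + r\sin^2\theta = r .
\]
\[
  \omega^\psi = g(f(y))\,\Delta_f(y)\, dy_1 \wedge dy_2 = r \, dr \wedge d\theta .
\]
```

Since $\Delta_f = r > 0$ here, the coefficient changes by a positive factor, which is the situation singled out later under the name of orientation preserving transition maps.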

We say that $\omega$ is of class $C^{p-1}$ at a point, if in some local representation
relative to a chart at that point, the functions $g_{(i)}$ as above
are of class $C^{p-1}$. The remark in the preceding paragraph then shows
that this will then be true for any local representation relative to any
chart at that point. We say that $\omega$ is $C^{p-1}$ if it is of class $C^{p-1}$ at every
point.

Theorems 3.1, 4.1, 4.2, 4.3 of Chapter XXI, concerning the operation
$\omega \mapsto d\omega$, and the inverse image of a form now extend immediately to
manifolds. In fact, the theorems of Chapter XXI give the expression for
these operations on the local representation of forms. We define the
wedge product $\omega \wedge \eta$ of two forms just as in the local case, according to
the general algebraic result of Theorem B in the Appendix to Chapter
XXI. We shall now repeat the statements of the theorems loc. cit. on
manifolds.


There exists a unique family of linear maps

\[ d: \Omega^r(X) \to \Omega^{r+1}(X) \qquad (r = 0, 1, 2, \ldots) \]

defined on the space of $r$-forms (of class $C^q$, into the space of forms of
class $C^{q-1}$), satisfying the properties that if $\deg \omega = r$, then

\[ d(\omega \wedge \eta) = d\omega \wedge \eta + (-1)^r \omega \wedge d\eta, \]

and $df = Tf$ if $f$ is a function, i.e. a form of degree 0.


If $f: X \to Y$ is a $C^p$ map, $p \geq 1$, then for each $r$ there exists a unique
linear map

\[ f^*: \Omega^r(Y) \to \Omega^r(X) \]

having the following properties:


(i) For any differential forms $\omega$, $\eta$ on $Y$ we have

\[ f^*(\omega \wedge \eta) = f^*(\omega) \wedge f^*(\eta). \]

(ii) If $g$ is a function on $Y$, then $f^*(g) = g \circ f$, and if $\omega$ is a 1-form then

\[ (f^*\omega)(x) = \omega(f(x)) \circ T_x f. \]


If $g: Y \to Z$ is a $C^p$ map, then

\[ (g \circ f)^* = f^* \circ g^*. \]


Let $f: X \to Y$ be a $C^p$ map and $\omega$ a differential form of class $C^1$ on
$Y$. Then

\[ f^*(d\omega) = d f^*\omega. \]

In particular, if $g$ is a $C^1$ function on $Y$, then

\[ f^*(dg) = d(g \circ f). \]


Observe that the operation $d$ loses one order of differentiability, and
that if $f$ is of class $C^p$, then $Tf$ is of class $C^{p-1}$, so that $f^*\omega$ has the
order of differentiability equal to the minimum of that of $Tf$ and $\omega$.

Our definition of the local representation of a differential form in
terms of local coordinates is compatible with this operation of inverse
image taken with respect to the map of a chart. In fact, if $(U, \varphi)$ is a
chart, so that

\[ \varphi: U \to \varphi U \]

is a $C^p$-isomorphism of $U$ onto an open set in a half space, and if $\omega$ is a
differential form on $U$, then we can take

\[ (\varphi^{-1})^*(\omega), \]

which is a differential form on $\varphi U$. The expression in local coordinates
of $\omega$ is nothing but the expression of $(\varphi^{-1})^*(\omega)$ taken with respect to the
identity chart of $\varphi U$ as a subset of $\mathbf{R}^n$. In the case of isomorphisms like
charts, it is useful to use the notation $\varphi_*\omega$ instead of the inverse image
we have just written.


We can define the support of a differential form as we define the
support of a function. It is the closure of the set of all $x \in X$ such that
$\omega(x) \neq 0$. If $\omega$ is a form of class $C^q$ and $\alpha$ is a $C^q$ function on $X$, then we
can form the product $\alpha\omega$, which is the form whose value at $x$ is $\alpha(x)\omega(x)$.
If $\alpha$ has compact support, then $\alpha\omega$ has compact support. Later, we shall
study the integration of forms, and reduce this to a local problem by
means of partitions of unity, in which we multiply a form by functions.

If $X$ is a manifold and $Y$ a submanifold, then any differential form on
$X$ induces a form on $Y$. We can view this as a very special case of the
inverse image of a form, under the embedding (injection) map

\[ \mathrm{id}: Y \to X. \]

In particular, if $Y$ has dimension $n - 1$, and if $(x_1, \ldots, x_n)$ is a system of
coordinates for $X$ at some point of $Y$ such that the points of $Y$ correspond
to those coordinates satisfying $x_j = c$ for some fixed number $c$, and
index $j$, and if the form on $X$ is given in terms of these coordinates by

\[ \omega(x) = f(x_1, \ldots, x_n) \, dx_1 \wedge \cdots \wedge \widehat{dx_j} \wedge \cdots \wedge dx_n, \]

then the restriction of $\omega$ to $Y$ (or the form induced on $Y$) has the
representation

\[ f(x_1, \ldots, c, \ldots, x_n) \, dx_1 \wedge \cdots \wedge \widehat{dx_j} \wedge \cdots \wedge dx_n, \]

where the roof over $dx_j$ means $dx_j$ is omitted. We should denote this
induced form by $\omega_Y$, although occasionally we omit the subscript $Y$. We
shall use such an induced form especially when $Y$ is the boundary of a
manifold $X$.


XXIII, §2. ORIENTATION


Let $U$, $V$ be open sets in half spaces of $\mathbf{R}^n$ and let $\varphi: U \to V$ be a $C^1$
isomorphism. We shall say that $\varphi$ is orientation preserving if the Jacobian
determinant $\Delta_\varphi(x)$ is $> 0$, all $x \in U$. If the Jacobian determinant is
negative, then we say that $\varphi$ is orientation reversing.

Let $X$ be a $C^p$ manifold, $p \geq 1$, and let $\{(U_i, \varphi_i)\}$ be an atlas. We say
that this atlas is oriented if all transition maps $\varphi_j \circ \varphi_i^{-1}$ are orientation
preserving. Two atlases $\{(U_i, \varphi_i)\}$ and $\{(V_j, \psi_j)\}$ are said to define the
same orientation, or to be orientation equivalent, if their union is oriented.
We can also define locally a chart $(V, \psi)$ to be orientation compatible with
the oriented atlas $\{(U_i, \varphi_i)\}$ if all transition maps $\varphi_i \circ \psi^{-1}$ (defined whenever
$U_i \cap V$ is not empty) are orientation preserving. An orientation
equivalence class of oriented atlases is said to define an oriented manifold,
or to be an orientation of the manifold. It is a simple exercise to
verify that if a connected manifold has an orientation, then it has two
distinct orientations.

The standard examples of the Moebius strip or projective plane show
that not all manifolds admit orientations. We shall now see that the
boundary of an oriented manifold with boundary can be given a natural
orientation.


Let $\varphi: U \to \mathbf{R}^n$ be an oriented chart at a boundary point of $X$, such
that:


(1) if $(x_1, \ldots, x_n)$ are the local coordinates of the chart, then the boundary
points correspond to those points in $\mathbf{R}^n$ satisfying $x_1 = 0$; and

(2) the points of $U$ not in the boundary have coordinates satisfying
$x_1 < 0$.


Then $(x_2, \ldots, x_n)$ are the local coordinates for a chart of the boundary,
namely the restriction of $\varphi$ to $\partial X \cap U$, and the picture is as follows.

[Figure: the half space $x_1 \leq 0$, with the boundary carrying the coordinates $(x_2, \ldots, x_n)$.]

We may say that we have considered a chart $\varphi$ such that the manifold
lies to the left of its boundary. If readers think of a domain in $\mathbf{R}^2$,
having a smooth curve for its boundary, as on the following picture, they
will see that our choice of chart corresponds to what is usually visualized
as “counterclockwise” orientation.


The collection of all pairs $(U \cap \partial X, \varphi|(U \cap \partial X))$, chosen according to
the criteria described above, is obviously an atlas for the boundary $\partial X$,
and we contend that it is an oriented atlas.


We prove this easily as follows. If

\[ (x_1, \ldots, x_n) = x \quad\text{and}\quad (y_1, \ldots, y_n) = y \]

are coordinate systems at a boundary point corresponding to choices
of charts made according to our specifications, then we can write $y = f(x)$
where $f = (f_1, \ldots, f_n)$ is the transition mapping. Since we deal with
oriented charts for $X$, we know that $\Delta_f(x) > 0$ for all $x$. Since $f$ maps
boundary into boundary, we have

\[ f_1(0, x_2, \ldots, x_n) = 0 \]

for all $x_2, \ldots, x_n$. Consequently the Jacobian matrix of $f$ at a point
$(0, x_2, \ldots, x_n)$ is equal to

\[ \begin{pmatrix} D_1 f_1(0, x_2, \ldots, x_n) & 0 \ \cdots \ 0 \\ * & \Delta_g^{(n-1)} \end{pmatrix}, \]

where $\Delta_g^{(n-1)}$ is the Jacobian matrix of the transition map $g$ induced by $f$
on the boundary, and given by

\[ y_2 = f_2(0, x_2, \ldots, x_n), \]
\[ \vdots \]
\[ y_n = f_n(0, x_2, \ldots, x_n). \]

However, we have

\[ D_1 f_1(0, x_2, \ldots, x_n) = \lim_{h \to 0} \frac{f_1(h, x_2, \ldots, x_n)}{h}, \]

taking the limit with $h < 0$ since by prescription, points of $X$ have coordinates
with $x_1 < 0$. Furthermore, for the same reason we have

\[ f_1(h, x_2, \ldots, x_n) < 0. \]

Consequently

\[ D_1 f_1(0, x_2, \ldots, x_n) > 0. \]

From this it follows that $\Delta_g^{(n-1)}(x_2, \ldots, x_n) > 0$, thus proving our assertion
that the atlas we have defined for $\partial X$ is oriented.
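
The block structure used in this argument can be checked numerically on a toy example. The transition map `f` below is a hypothetical choice of ours (it is not from the text); it satisfies $f_1(0, x_2) = 0$ and sends $x_1 < 0$ to negative first coordinate, as the argument requires. The sketch estimates $D_1 f_1$ by a one-sided quotient with $h < 0$ and the remaining partial derivatives by central differences, which evaluate the formula for `f` slightly outside the half space.

```python
def f(x1, x2):
    """Hypothetical transition map with f_1(0, x2) = 0 and f_1 < 0 for x1 < 0."""
    return (x1 * (1.0 + x2 * x2), x2 + x1)

def d1_f1_one_sided(x2, h=-1e-6):
    """Quotient f_1(h, x2) / h with h < 0, as in the limit taken in the text."""
    return f(h, x2)[0] / h

def jacobian_at_boundary(x2, h=1e-6):
    """2x2 Jacobian of f at (0, x2) by central differences: rows are f_1, f_2."""
    plus1, minus1 = f(h, x2), f(-h, x2)
    plus2, minus2 = f(0.0, x2 + h), f(0.0, x2 - h)
    col1 = [(p - m) / (2.0 * h) for p, m in zip(plus1, minus1)]   # D_1 f_i
    col2 = [(p - m) / (2.0 * h) for p, m in zip(plus2, minus2)]   # D_2 f_i
    return [[col1[0], col2[0]], [col1[1], col2[1]]]

x2 = 0.7
J = jacobian_at_boundary(x2)
det = J[0][0] * J[1][1] - J[0][1] * J[1][0]
d1f1 = d1_f1_one_sided(x2)
boundary_jacobian = J[1][1]     # Jacobian of the induced map g(x2) = f_2(0, x2)

print(J, det, d1f1, boundary_jacobian)
```

The upper right entry vanishes because $f_1$ is identically 0 along the boundary, so the determinant factors as $D_1 f_1$ times the Jacobian determinant of the induced boundary map; positivity of the first two forces positivity of the third.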


From now on, when we deal with an oriented manifold, it is understood 
that its boundary is taken with orientation described above, and called the 
induced orientation. 


XXIII, §3. THE MEASURE ASSOCIATED WITH
A DIFFERENTIAL FORM


Let $X$ be a manifold of class $C^p$ with $p \geq 1$. We assume from now on
that $X$ is Hausdorff and has a countable base. Then we know that $X$
admits $C^p$ partitions of unity, subordinated to any given open covering.
(Actually, instead of the conditions we assumed, we could just as well
have assumed the existence of $C^p$ partitions of unity, which is the precise
condition to be used in the sequel.)


Theorem 3.1. Let $\dim X = n$ and let $\omega$ be an $n$-form on $X$ of class $C^0$,
i.e. continuous. Then there exists a unique positive functional $\lambda$ on
$C_c(X)$ having the following property. If $(U, \varphi)$ is a chart and

\[ \omega(x) = f(x) \, dx_1 \wedge \cdots \wedge dx_n \]

is the local representation of $\omega$ in this chart, then for any $g \in C_c(X)$ with
support in $U$, we have

\[ (1) \qquad \lambda g = \int_{\varphi U} g_\varphi(x) |f(x)| \, dx, \]

where $g_\varphi$ represents $g$ in the chart [i.e. $g_\varphi(x) = g(\varphi^{-1}(x))$], and $dx$ is
Lebesgue measure.


Proof. The integral in (1) defines a positive functional on $C_c(U)$. The
change of variables formula shows that if $(U, \varphi)$ and $(V, \psi)$ are two
charts, and if $g$ has support in $U \cap V$, then the value of the functional is
independent of the choice of charts. Thus we get a positive functional by
the general localization theorem for measures or functionals (Theorem 5.1
of Chapter IX, §5), using partitions of unity.
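
The chart independence invoked in this proof can be seen numerically in dimension 1. In the sketch below, everything is an illustrative assumption of ours: the form is $\omega = x \, dx$ on an interval, the test function `g` is a smooth bump supported in $(1, 2)$, and the second chart is related to the first by the orientation reversing change of variable $x = 1/y$. With the absolute value the two chart computations agree; without it the sign flips, which is why an orientation is needed to integrate $\omega$ itself.

```python
import math

def g(x):
    """Smooth bump function with compact support contained in [1, 2]."""
    if x <= 1.0 or x >= 2.0:
        return 0.0
    return math.exp(-1.0 / ((x - 1.0) * (2.0 - x)))

def f(x):
    """Coefficient of the form omega = f(x) dx in the first chart."""
    return x

def midpoint(func, a, b, steps=4000):
    """Midpoint-rule approximation of the integral of func over [a, b]."""
    h = (b - a) / steps
    return h * sum(func(a + (k + 0.5) * h) for k in range(steps))

def change(y):
    """Transition map x = 1/y between the two charts (orientation reversing)."""
    return 1.0 / y

def change_prime(y):
    """Derivative of the transition map, negative everywhere on (1/2, 1)."""
    return -1.0 / (y * y)

# Formula (1) computed in the chart x and in the chart y.
in_x = midpoint(lambda x: g(x) * abs(f(x)), 1.0, 2.0)
in_y = midpoint(lambda y: g(change(y)) * abs(f(change(y)) * change_prime(y)), 0.5, 1.0)
# The same computation in the chart y without the absolute value.
signed_y = midpoint(lambda y: g(change(y)) * f(change(y)) * change_prime(y), 0.5, 1.0)

print(in_x, in_y, signed_y)
```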


The positive measure corresponding to the functional in Theorem 3.1
will be called the measure associated with $|\omega|$, and can be denoted by
$\mu_{|\omega|}$.

Theorem 3.1 does not need any orientability assumption. With such
an assumption, we have a similar theorem, obtained without taking the
absolute value.


Theorem 3.2. Let $\dim X = n$ and assume that $X$ is oriented. Let $\omega$ be
an $n$-form on $X$ of class $C^0$. Then there exists a unique functional $\lambda$ on
$C_c(X)$ having the following property. If $(U, \varphi)$ is an oriented chart and

\[ \omega(x) = f(x) \, dx_1 \wedge \cdots \wedge dx_n \]

is the local representation of $\omega$ in this chart, then for any $g \in C_c(X)$ with
support in $U$, we have

\[ \lambda g = \int_{\varphi U} g_\varphi(x) f(x) \, dx, \]

where $g_\varphi$ represents $g$ in the chart, and $dx$ is Lebesgue measure.


Proof. Since the Jacobian determinant of transition maps belonging
to oriented charts is positive, we see that Theorem 3.2 follows like Theorem
3.1 from the change of variables formula (in which the absolute
value sign now becomes unnecessary) and the existence of partitions of
unity.


If $\lambda$ is the functional of Theorem 3.2, we shall call it the functional
associated with $\omega$. For any function $g \in C_c(X)$, we define

\[ \int_X g\omega = \lambda g. \]

If $\mu_{|\omega|}(X)$ is finite, then we know by general theory that we can extend $\lambda$
by continuity to $\mathcal{L}^1(|\mu|)$, where $\mu$ is the regular complex Borel measure
associated with $\lambda$ (cf. Theorem 4.2, Chapter IX and also Exercise 9 of
that chapter). If in particular $\omega$ has compact support, we can also proceed
directly as follows. Let $\{\alpha_i\}$ be a partition of unity over $X$ such
that each $\alpha_i$ has compact support. We define

\[ \int_X \omega = \sum_i \int_X \alpha_i \omega, \]

all but a finite number of terms in this sum being equal to 0. As usual,
it is immediately verified that this sum is in fact independent of the
choice of partition of unity, and in fact, we could just as well use only a
partition of unity over the support of $\omega$. Alternatively, if $\alpha$ is a function
in $C_c(X)$ which is equal to 1 on the support of $\omega$, then we could also
define

\[ \int_X \omega = \int_X \alpha\omega. \]

It is clear that these two possible definitions are equivalent.

For an interesting theorem at the level of this chapter, see J. Moser’s 
paper “On the volume element on a manifold,” Trans. Amer. Math. Soc. 
120 (December 1965) pp. 286—294. 


XXIII, §4. STOKES’ THEOREM FOR A
RECTANGULAR SIMPLEX


Let

\[ R = [a_1, b_1] \times \cdots \times [a_n, b_n] \]

be a rectangle in $n$-space, i.e. a product of $n$ closed intervals. The set-theoretic
boundary of $R$ consists of the union over all $i = 1, \ldots, n$ of the
pieces

\[ R_i^0 = [a_1, b_1] \times \cdots \times \{a_i\} \times \cdots \times [a_n, b_n], \]

\[ R_i^1 = [a_1, b_1] \times \cdots \times \{b_i\} \times \cdots \times [a_n, b_n]. \]

If

\[ \omega(x_1, \ldots, x_n) = f(x_1, \ldots, x_n) \, dx_1 \wedge \cdots \wedge \widehat{dx_j} \wedge \cdots \wedge dx_n \]

is an $(n - 1)$-form, and the roof over anything means that this thing is to
be omitted, then we define

\[ \int_{R_i^0} \omega = \int_{a_1}^{b_1} \cdots \widehat{\int_{a_i}^{b_i}} \cdots \int_{a_n}^{b_n} f(x_1, \ldots, a_i, \ldots, x_n) \, dx_1 \cdots \widehat{dx_i} \cdots dx_n \]

if $i = j$, and 0 otherwise. And similarly for the integral over $R_i^1$. We
define the integral over the oriented boundary to be

\[ \int_{\partial^0 R} = \sum_{i=1}^n (-1)^i \left[ \int_{R_i^0} - \int_{R_i^1} \right]. \]


Stokes’ Theorem for Rectangles. Let $R$ be a rectangle in an open set $U$
in $n$-space. Let $\omega$ be an $(n - 1)$-form on $U$. Then

\[ \int_R d\omega = \int_{\partial^0 R} \omega. \]


Proof. In two dimensions, the picture looks like this: 


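It suffices to prove the assertion when $\omega$ is a decomposable form, say

\[ \omega(x) = f(x_1, \ldots, x_n) \, dx_1 \wedge \cdots \wedge \widehat{dx_j} \wedge \cdots \wedge dx_n. \]

We then evaluate the integral over the boundary of $R$. If $i \neq j$, then it is
clear that

\[ \int_{R_i^0} \omega = 0 = \int_{R_i^1} \omega, \]

so that

\[ \int_{\partial^0 R} \omega = (-1)^j \int_{a_1}^{b_1} \cdots \widehat{\int_{a_j}^{b_j}} \cdots \int_{a_n}^{b_n} \bigl[ f(x_1, \ldots, a_j, \ldots, x_n) - f(x_1, \ldots, b_j, \ldots, x_n) \bigr] \, dx_1 \cdots \widehat{dx_j} \cdots dx_n. \]

On the other hand, from the definitions we find that

\[ d\omega(x) = \left( \frac{\partial f}{\partial x_1} dx_1 + \cdots + \frac{\partial f}{\partial x_n} dx_n \right) \wedge dx_1 \wedge \cdots \wedge \widehat{dx_j} \wedge \cdots \wedge dx_n
 = (-1)^{j-1} \frac{\partial f}{\partial x_j} \, dx_1 \wedge \cdots \wedge dx_n. \]

(The $(-1)^{j-1}$ comes from interchanging $dx_j$ with $dx_1, \ldots, dx_{j-1}$. All other
terms disappear by the alternation rule.)

Integrating $d\omega$ over $R$, we may use repeated integration and integrate
$\partial f/\partial x_j$ with respect to $x_j$ first. Then the fundamental theorem of calculus
for one variable yields

\[ \int_{a_j}^{b_j} \frac{\partial f}{\partial x_j} \, dx_j = f(x_1, \ldots, b_j, \ldots, x_n) - f(x_1, \ldots, a_j, \ldots, x_n). \]

We then integrate with respect to the other variables, and multiply by
$(-1)^{j-1}$. This yields precisely the value found for the integral of $\omega$ over
the oriented boundary $\partial^0 R$, and proves the theorem.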
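
The sign bookkeeping in this proof is easy to get wrong, so a numerical check in dimension $n = 2$ may be useful. The sketch below is only an illustration under our own assumptions: the coefficient `f`, the rectangle `rect`, and the helper names are chosen here and do not come from the text. For $j = 1, 2$ it compares $\int_R d\omega$, computed as $(-1)^{j-1}$ times the double integral of $\partial f/\partial x_j$, with the oriented boundary integral, computed as $(-1)^j$ times the integral of $f(\ldots, a_j, \ldots) - f(\ldots, b_j, \ldots)$ over the remaining variable.

```python
import math

def midpoint(func, a, b, steps=200):
    """Midpoint-rule approximation of the integral of func over [a, b]."""
    h = (b - a) / steps
    return h * sum(func(a + (k + 0.5) * h) for k in range(steps))

def boundary_side(f, rect, j):
    """Integral of omega = f dx_1 ^ ... ^ (dx_j omitted) ^ ... over the oriented boundary."""
    (a1, b1), (a2, b2) = rect
    if j == 1:
        return (-1) ** j * midpoint(lambda x2: f(a1, x2) - f(b1, x2), a2, b2)
    return (-1) ** j * midpoint(lambda x1: f(x1, a2) - f(x1, b2), a1, b1)

def interior_side(df_dxj, rect, j):
    """Integral over R of d(omega) = (-1)^(j-1) (df/dx_j) dx_1 ^ dx_2."""
    (a1, b1), (a2, b2) = rect
    total = midpoint(
        lambda x1: midpoint(lambda x2: df_dxj(x1, x2), a2, b2), a1, b1
    )
    return (-1) ** (j - 1) * total

def f(x1, x2):
    """Illustrative coefficient function."""
    return x1 * math.sin(x2) + x2 ** 2

# Partial derivatives of f, written out by hand.
partials = {
    1: lambda x1, x2: math.sin(x2),
    2: lambda x1, x2: x1 * math.cos(x2) + 2.0 * x2,
}

rect = ((0.0, 1.0), (0.0, 2.0))
checks = {
    j: (interior_side(partials[j], rect, j), boundary_side(f, rect, j))
    for j in (1, 2)
}
for j, (lhs, rhs) in checks.items():
    print(j, lhs, rhs)
```

Both sides agree up to the quadrature error of the midpoint rule, for each choice of the omitted index $j$.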


Remark. Stokes’ theorem for a rectangle extends at once to a version
in which we parametrize a subset of some space by a rectangle. Indeed,
if $\sigma: R \to V$ is a $C^1$ map of a rectangle of dimension $n$ into an open set $V$
in $\mathbf{R}^N$, and if $\omega$ is an $(n - 1)$-form in $V$, we may define

\[ \int_\sigma d\omega = \int_R \sigma^* d\omega. \]

One can define

\[ \int_{\partial\sigma} \omega = \int_{\partial^0 R} \sigma^* \omega, \]

and then we have a formula

\[ \int_\sigma d\omega = \int_{\partial\sigma} \omega. \]

In the next section, we prove a somewhat less formal result.


XXIII, §5. STOKES’ THEOREM ON A MANIFOLD


Theorem 5.1. Let $X$ be an oriented manifold of class $C^2$, dimension $n$,
and let $\omega$ be an $(n - 1)$-form on $X$, of class $C^1$. Assume that $\omega$ has
compact support. Then

\[ \int_X d\omega = \int_{\partial X} \omega. \]


Proof. Let $\{\alpha_i\}_{i \in I}$ be a partition of unity, of class $C^2$. Then

\[ \sum_{i \in I} \alpha_i \omega = \omega, \]

and this sum has only a finite number of non-zero terms since the
support of $\omega$ is compact. Using the additivity of the operation $d$, and
that of the integral, we find:

\[ \int_X d\omega = \sum_{i \in I} \int_X d(\alpha_i \omega). \]

Suppose that $\alpha_i$ has compact support in some open set $V_i$ of $X$ and that
we can prove

\[ \int_{V_i} d(\alpha_i \omega) = \int_{V_i \cap \partial X} \alpha_i \omega, \]

in other words we can prove Stokes’ theorem locally in $V_i$. We can write

\[ \int_{V_i \cap \partial X} \alpha_i \omega = \int_{\partial X} \alpha_i \omega, \]

and similarly

\[ \int_{V_i} d(\alpha_i \omega) = \int_X d(\alpha_i \omega). \]

Using the additivity of the integral once more, we get

\[ \int_X d\omega = \sum_{i \in I} \int_X d(\alpha_i \omega) = \sum_{i \in I} \int_{\partial X} \alpha_i \omega = \int_{\partial X} \omega, \]

which yields Stokes’ theorem on the whole manifold. Thus our argument
with partitions of unity reduces Stokes’ theorem to the local case, namely
it suffices to prove that for each point of $X$ there exists an open neighborhood
$V$ such that if $\omega$ has compact support in $V$, then Stokes’ theorem
holds with $X$ replaced by $V$. We now do this.

If the point is not a boundary point, we take an oriented chart $(U, \varphi)$
at the point, containing an open neighborhood $V$ of the point, satisfying
the following conditions: $\varphi U$ is an open ball, and $\varphi V$ is the interior of a
rectangle, whose closure is contained in $\varphi U$. If $\omega$ has compact support
in $V$, then its local representation in $\varphi U$ has compact support in $\varphi V$.
Applying Stokes’ theorem for rectangles as proved in the preceding section,
we find that the two integrals occurring in Stokes’ formula are
equal to 0 in this case (the integral over an empty boundary being equal
to 0 by convention).

Now suppose that we deal with a boundary point. We take an
oriented chart $(U, \varphi)$ at the point, having the following properties. First,
$\varphi U$ is described by the following inequalities in terms of local coordinates
$(x_1, \ldots, x_n)$:

\[ -2 < x_1 \leq 1 \quad\text{and}\quad -2 < x_j < 2 \quad\text{for } j = 2, \ldots, n. \]

Next, the given point has coordinates $(1, 0, \ldots, 0)$, and that part of $U$ on
the boundary of $X$, namely $U \cap \partial X$, is given in terms of these coordinates
by the equation $x_1 = 1$. We then let $V$ consist of those points
whose local coordinates satisfy

\[ 0 < x_1 \leq 1 \quad\text{and}\quad -1 < x_j < 1 \quad\text{for } j = 2, \ldots, n. \]

If $\omega$ has compact support in $V$, then $\omega$ is equal to 0 on the boundary of the
rectangle $R$ equal to the closure of $\varphi V$, except on the face given by $x_1 = 1$,
which defines that part of the rectangle corresponding to $\partial X \cap V$. Thus
the support of $\omega$ looks like the shaded portion of the following picture.


In the sum giving the integral over the boundary of a rectangle as in the
previous section, only one term will give a non-zero contribution, corresponding
to $i = 1$, which is

\[ (-1) \left[ \int_{R_1^0} \omega - \int_{R_1^1} \omega \right]. \]

Furthermore, the integral over $R_1^0$ will also be 0, and in the contribution
of the integral over $R_1^1$, the two minus signs will cancel, and yield the
integral of $\omega$ over the part of the boundary lying in $V$, because our
charts are so chosen that $(x_2, \ldots, x_n)$ is an oriented system of coordinates
for the boundary. Thus we find

\[ \int_V d\omega = \int_{V \cap \partial X} \omega, \]


which proves Stokes’ theorem locally in this case, and concludes the 
proof of Theorem 5.1. 
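
As a sanity check of Theorem 5.1 in the simplest curved case, one can take $X$ to be the closed unit disc in $\mathbf{R}^2$ with the counterclockwise orientation on its boundary circle, and $\omega = x \, dy$, so that $d\omega = dx \wedge dy$. This example and the function names below are our own choices for illustration. Both sides should equal the area $\pi$ of the disc.

```python
import math

def boundary_integral(steps=2000):
    """Integral of omega = x dy over the unit circle, traversed counterclockwise."""
    h = 2.0 * math.pi / steps
    total = 0.0
    for k in range(steps):
        theta = (k + 0.5) * h
        x = math.cos(theta)            # x along the circle
        dy_dtheta = math.cos(theta)    # derivative of y = sin(theta)
        total += x * dy_dtheta * h
    return total

def interior_integral(steps=2000):
    """Integral of d(omega) = dx ^ dy over the unit disc, using polar coordinates."""
    h = 1.0 / steps
    # The integrand is r dr dtheta; the theta integration contributes 2*pi.
    radial = sum((k + 0.5) * h * h for k in range(steps))
    return 2.0 * math.pi * radial

lhs = interior_integral()
rhs = boundary_integral()
print(lhs, rhs, math.pi)
```

Reversing the direction in which the circle is traversed changes the sign of the boundary integral, which is the numerical counterpart of the role played by the induced orientation.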


For any number of reasons, some of which we consider in the next
section, it is useful to formulate conditions under which Stokes’ theorem
holds even when the form $\omega$ does not have compact support. We shall
say that $\omega$ has almost compact support if there exists a decreasing sequence
of open sets $\{U_k\}$ in $X$ such that the intersection

\[ \bigcap_{k=1}^{\infty} U_k \]

is empty, and a sequence of $C^1$ functions $\{g_k\}$, having the following
properties:


AC 1. We have $0 \leq g_k \leq 1$, $g_k = 1$ outside $U_k$, and $g_k \omega$ has compact
support.


AC 2. If $\mu_k$ is the measure associated with $|dg_k \wedge \omega|$ on $X$, then

\[ \lim_{k \to \infty} \mu_k(U_k) = 0. \]


We then have the following application of Stokes’ theorem.


Corollary 5.2. Let $X$ be a $C^2$ oriented manifold, of dimension $n$, and let
$\omega$ be an $(n - 1)$-form on $X$, of class $C^1$. Assume that $\omega$ has almost
compact support, and that the measures associated with $|d\omega|$ on $X$, and
$|\omega|$ on $\partial X$ are finite. Then

\[ \int_X d\omega = \int_{\partial X} \omega. \]


Proof. By our standard form of Stokes’ theorem we have

\[ \int_{\partial X} g_k \omega = \int_X d(g_k \omega) = \int_X dg_k \wedge \omega + \int_X g_k \, d\omega. \]

We estimate the left-hand side by

\[ \left| \int_{\partial X} \omega - \int_{\partial X} g_k \omega \right| = \left| \int_{\partial X} (1 - g_k) \omega \right| \leq \mu_{|\omega|}(U_k \cap \partial X). \]

Since the intersection of the sets $U_k$ is empty, it follows for a purely
measure-theoretic reason that

\[ \lim_{k \to \infty} \int_{\partial X} g_k \omega = \int_{\partial X} \omega. \]

Similarly,

\[ \lim_{k \to \infty} \int_X g_k \, d\omega = \int_X d\omega. \]

The integral of $dg_k \wedge \omega$ over $X$ approaches 0 as $k \to \infty$ by assumption,
and the fact that $dg_k \wedge \omega$ is equal to 0 on the complement of $U_k$ since $g_k$
is constant on this complement. This proves our corollary.
The above proof shows that the second condition AC 2 is a very 
natural one to reduce the integral of an arbitrary form to that of a form 
with compact support. In the next section, we relate this condition to a 
question of singularities when the manifold is embedded in some bigger 
space. 


XXIII, §6. STOKES’ THEOREM WITH SINGULARITIES


If $X$ is a compact manifold, then of course every differential form on $X$
has compact support. However, the version of Stokes’ theorem which we
have given is useful in contexts when we start with an object which is
not a manifold, say as a subset of $\mathbf{R}^N$, but is such that when we remove a
portion of it, what remains is a manifold. For instance, consider a cone
(say the solid cone) as illustrated in the next picture:

The vertex and the circle surrounding the base disc prevent the cone
from being a submanifold of $\mathbf{R}^3$. However, if we delete the vertex and
this circle, what remains is a submanifold with boundary embedded in
$\mathbf{R}^3$. The boundary consists of the conical shell, and of the base disc
(without its surrounding circle). Another example is given by polyhedra,
as on the following figure.


The idea is to approximate a given form by a form with compact
support, to which we can apply Theorem 5.1, and then take the limit.
We shall indicate one possible technique to do this.

The word “boundary” has been used in two senses: the sense of point
set topology, and the sense of boundary of a manifold. Up to now, they
were used in different contexts so no confusion could arise. We must
now make a distinction, and therefore use the word boundary only in its
manifold sense. If $X$ is a subset of $\mathbf{R}^N$, we denote its closure by $\bar{X}$ as
usual. We call the set theoretic difference $\bar{X} - X$ the frontier of $X$ in $\mathbf{R}^N$,
and denote it by $\mathrm{fr}(X)$.

Let X be a submanifold without boundary of $\mathbf{R}^N$, of dimension n. We
know that this means that at each point of X there exists a chart for an
open neighborhood of this point in $\mathbf{R}^N$ such that the points of X in this
chart correspond to a factor in a product, just as in Chapter XXII, §2. A
point P of $\bar{X} - X$ will be called a regular frontier point of X if there
exists a chart at P in $\mathbf{R}^N$ with local coordinates $(x_1, \ldots, x_N)$ such that P
has coordinates $(0, \ldots, 0)$; the points of X are those with coordinates

\[ x_{n+1} = \cdots = x_N = 0 \quad\text{and}\quad x_n < 0; \]

and the points of the frontier of X which lie in the chart are those with
coordinates satisfying

\[ x_n = x_{n+1} = \cdots = x_N = 0. \]

The set of all regular frontier points of X will be denoted by $\partial X$, and
will be called the boundary of X. We may say that $X \cup \partial X$ is a submanifold
of $\mathbf{R}^N$, possibly with boundary.

A point of the frontier of X which is not regular will be called
singular. It is clear that the set of singular points is closed in $\mathbf{R}^N$. We
now formulate a version of Theorem 5.1 when $\omega$ does not necessarily
have compact support in $X \cup \partial X$. Let S be a subset of $\mathbf{R}^N$. By a
fundamental sequence of open neighborhoods of S we shall mean a sequence
$\{U_k\}$ of open sets containing S such that, if W is an open set
containing S, then $U_k \subset W$ for all sufficiently large k.

Let S be the set of singular frontier points of X and let $\omega$ be a form
defined on an open neighborhood of $\bar{X}$, and having compact support.
The intersection of $\operatorname{supp} \omega$ with $X \cup \partial X$ need not be compact, so that
we cannot apply Theorem 5.1 as it stands. The idea is to find a fundamental
sequence of neighborhoods $\{U_k\}$ of S, and a function $g_k$ which is
0 on a neighborhood of S and 1 outside $U_k$, so that $g_k\omega$ differs from $\omega$
only inside $U_k$. We can then apply Theorem 5.1 to $g_k\omega$ and we hope
that taking the limit yields Stokes’ theorem for $\omega$ itself. However, we
have

\[ \int_X d(g_k\omega) = \int_X dg_k \wedge \omega + \int_X g_k\, d\omega. \]

Thus we have an extra term on the right, which should go to 0 as $k \to \infty$
if we wish to apply this method. In view of this, we make the following
definition.

Let S be a closed subset of $\mathbf{R}^N$. We shall say that S is negligible for X
if there exists an open neighborhood U of S in $\mathbf{R}^N$, a fundamental
sequence of open neighborhoods $\{U_k\}$ of S in U, with $\bar{U}_k \subset U$, and a
sequence of $C^1$ functions $\{g_k\}$, having the following properties.


NEG 1. We have $0 \le g_k \le 1$. Also, $g_k(x) = 0$ for x in some open
neighborhood of S, and $g_k(x) = 1$ for $x \notin U_k$.


NEG 2. If $\omega$ is an $(n-1)$-form of class $C^1$ on U, and $\mu_k$ is the measure
associated with $|dg_k \wedge \omega|$ on $U \cap X$, then $\mu_k$ is finite for large
k, and

\[ \lim_{k \to \infty} \mu_k(U \cap X) = 0. \]


From our first condition, we see that $g_k\omega$ vanishes on an open neighborhood
of S. Since $g_k = 1$ on the complement of $\bar{U}_k$, we have $dg_k = 0$ on
this complement, and therefore our second condition implies that the
measures induced on X near the singular frontier by $|dg_k \wedge \omega|$
(for $k = 1, 2, \ldots$) are concentrated on shrinking neighborhoods and tend to 0 as
$k \to \infty$.
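Condition NEG 2 can be made concrete in a small numerical sketch. The setting below is an assumption chosen for illustration and is not taken from the text: n = 2, S is the origin, $\omega = dy$ near the origin, and $g_k(x) = \varphi(kx)$ for one hypothetical bump function $\varphi$ which is 0 where the sup norm is at most 1/2 and 1 where it is at least 1. In that case $dg_k \wedge \omega = D_1 g_k\, dx \wedge dy$, and a change of variables suggests that the total mass of $|dg_k \wedge \omega|$ is proportional to $1/k$.

```python
import math

def smooth_step(t):
    """Smooth step (illustrative choice): 0 for t <= 1/2, 1 for t >= 1."""
    u = 2.0 * t - 1.0
    if u <= 0.0:
        return 0.0
    if u >= 1.0:
        return 1.0
    a = math.exp(-1.0 / u)
    b = math.exp(-1.0 / (1.0 - u))
    return a / (a + b)

def phi(x, y):
    """Hypothetical bump: 0 where max(|x|, |y|) <= 1/2, 1 where it is >= 1."""
    return 1.0 - (1.0 - smooth_step(abs(x))) * (1.0 - smooth_step(abs(y)))

def g(k, x, y):
    """Assumed cutoff g_k(x, y) = phi(kx, ky) for S = {origin}."""
    return phi(k * x, k * y)

def mass(k, cells=200):
    """Midpoint-rule estimate of the integral of |D_1 g_k| over [-1/k, 1/k]^2.

    Outside this square g_k is identically 1, so nothing is lost.
    D_1 g_k is approximated by a central difference quotient.
    """
    h = 2.0 / (k * cells)
    delta = h / 2.0
    total = 0.0
    for i in range(cells):
        x = -1.0 / k + (i + 0.5) * h
        for j in range(cells):
            y = -1.0 / k + (j + 0.5) * h
            d1 = (g(k, x + delta, y) - g(k, x - delta, y)) / (2.0 * delta)
            total += abs(d1) * h * h
    return total

# k * mass(k) should stay (nearly) constant, so mass(k) decays like 1/k.
for k in (1, 2, 4, 8):
    m = mass(k)
    print(k, m, k * m)
```

The derivative of $g_k$ grows like k, but it lives on a set of area of order $1/k^2$, which is the mechanism used in the proof of Criterion 2.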


Theorem 6.1 (Stokes’ Theorem with Singularities). Let X be an oriented,
$C^3$ submanifold without boundary of $\mathbf{R}^N$. Let $\dim X = n$. Let $\omega$
be an $(n-1)$-form of class $C^1$ on an open neighborhood of $\bar{X}$ in $\mathbf{R}^N$,
and with compact support. Assume that:

(i) If S is the set of singular points in the frontier of X, then
$S \cap \operatorname{supp} \omega$ is negligible for X.
(ii) The measures associated with $|d\omega|$ on X, and $|\omega|$ on $\partial X$, are finite.


Then

\[ \int_X d\omega = \int_{\partial X} \omega. \]


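Proof. Let U, $\{U_k\}$, and $\{g_k\}$ satisfy conditions NEG 1 and NEG 2.
Then $g_k\omega$ is 0 on an open neighborhood of S, and since $\omega$ is assumed to
have compact support, one verifies immediately that

\[ (\operatorname{supp} g_k\omega) \cap (X \cup \partial X) \]

is compact. Thus Theorem 5.1 is applicable, and we get

\[ \int_{\partial X} g_k\omega = \int_X d(g_k\omega) = \int_X dg_k \wedge \omega + \int_X g_k\, d\omega. \]

Let $\mu_\omega$ denote the measure associated with $|\omega|$ on $\partial X$. We have

\[ \left| \int_{\partial X} \omega - \int_{\partial X} g_k\omega \right|
   \le \int_{\partial X} |(1 - g_k)\omega|
   \le \int_{U_k \cap \partial X} 1\, d\mu_\omega = \mu_\omega(U_k \cap \partial X). \]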


Since the intersection of all sets $U_k \cap \partial X$ is empty, it follows for purely
measure-theoretic reasons that the limit of the right-hand side is 0 as
$k \to \infty$. Thus

\[ \lim_{k \to \infty} \int_{\partial X} g_k\omega = \int_{\partial X} \omega. \]


For similar reasons, we have

\[ \lim_{k \to \infty} \int_X g_k\, d\omega = \int_X d\omega. \]

Our second assumption NEG 2 guarantees that the integral of $dg_k \wedge \omega$
over X approaches 0. This proves our theorem.


We shall now give criteria for a set to be negligible.


Criterion 1. Let S, T be compact negligible sets for a submanifold X of
$\mathbf{R}^N$ (assuming X without boundary). Then the union $S \cup T$ is negligible
for X.




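Proof. Let U, $\{U_k\}$, $\{g_k\}$ and V, $\{V_k\}$, $\{h_k\}$ be triples associated with S
and T, respectively, as in conditions NEG 1 and NEG 2 (with V replacing
U and h replacing g when T replaces S). Let

\[ W = U \cup V, \qquad W_k = U_k \cup V_k, \qquad f_k = g_k h_k. \]

Then the open sets $\{W_k\}$ form a fundamental sequence of open neighborhoods
of $S \cup T$ in W, and NEG 1 is trivially satisfied. As for NEG 2, we
have

\[ d(g_k h_k) \wedge \omega = h_k\, dg_k \wedge \omega + g_k\, dh_k \wedge \omega. \]

Since $0 \le g_k \le 1$ and $0 \le h_k \le 1$, the measure associated with
$|df_k \wedge \omega|$ is bounded by the sum of the measures associated with
$|dg_k \wedge \omega|$ and $|dh_k \wedge \omega|$, so that NEG 2 is also satisfied, thus
proving our criterion.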


Criterion 2. Let X be an open set, and let S be a compact subset in $\mathbf{R}^n$.
Assume that there exists a closed rectangle R of dimension $m \le n - 2$
and a $C^1$ map $\sigma\colon R \to \mathbf{R}^n$ such that $S = \sigma(R)$. Then S is negligible for X.


Before giving the proof, we make a few simple remarks. First,
we could always take $m = n - 2$, since any parametrization by a rectangle
of dimension $< n - 2$ can be extended to a parametrization by a
rectangle of dimension $n - 2$ simply by projecting away extra coordinates.
Second, by our first criterion, we see that a finite union of sets as
described above, i.e. parametrized smoothly by rectangles of codimension
$\ge 2$, is negligible. Third, our Criterion 2, combined with the first criterion,
shows that negligibility in this case is local, i.e. we can subdivide a
rectangle into small pieces.

We now prove Criterion 2. Composing $\sigma$ with a suitable linear map,
we may assume that R is a unit cube. We cut up each side of the cube
into k equal segments and thus get $k^m$ small cubes. Since the derivative
of $\sigma$ is bounded on a compact set, the image of each small cube is
contained in an n-cube in $\mathbf{R}^n$ of radius $\le C/k$ (by the mean value theorem),
whose n-dimensional volume is $\le (2C)^n/k^n$. Thus we can cover the
image by small cubes such that the sum of their n-dimensional volumes
is $\le (2C)^n/k^{n-m} \le (2C)^n/k^2$.
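The counting argument above can be checked numerically in a small case. The following Python sketch is only an illustration under assumptions of our own: we take n = 3 and m = 1, and the curve `sigma` (a helix) is a hypothetical choice, with C a bound for the components of its derivative. The check that each piece of the image lies in its cube is done on sample points only.

```python
import math

def sigma(t):
    """Hypothetical C^1 map from the unit interval into R^3 (m = 1, n = 3)."""
    return (math.cos(2 * math.pi * t), math.sin(2 * math.pi * t), t)

# Bound for every component of the derivative of sigma on [0, 1].
C = 2 * math.pi

def sup_dist(p, q):
    """Distance between two points in the sup norm."""
    return max(abs(a - b) for a, b in zip(p, q))

def covering(k, samples=50):
    """Cover sigma([0, 1]) by k cubes of radius C/k centered at sigma(i/k).

    Returns (covered, total_volume), where covered reports whether every
    sampled point of the i-th piece lies in the i-th cube.
    """
    n = 3
    radius = C / k
    covered = True
    for i in range(k):
        center = sigma(i / k)
        for j in range(samples + 1):
            t = (i + j / samples) / k
            if sup_dist(sigma(t), center) > radius + 1e-12:
                covered = False
    total_volume = k * (2 * radius) ** n
    return covered, total_volume

for k in (1, 10, 100):
    ok, volume = covering(k)
    # The last column is the bound (2C)^n / k^(n-m) with n - m = 2.
    print(k, ok, volume, (2 * C) ** 3 / k ** 2)
```

Here the number of cubes is $k^m = k$ and each has volume $(2C/k)^3$, so the total volume is $(2C)^3/k^2$, in agreement with the estimate in the text.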


Lemma 6.2. Let S be a compact subset of $\mathbf{R}^n$. Let $U_k$ be the open set
of points x such that $d(x, S) < 2/k$. There exists a $C^\infty$ function $g_k$ on $\mathbf{R}^n$
which is equal to 0 in some open neighborhood of S, equal to 1 outside
$U_k$, with $0 \le g_k \le 1$, and such that all partial derivatives of $g_k$ are bounded
by $C_1 k$, where $C_1$ is a constant depending only on n.


Proof. Let $\varphi$ be a $C^\infty$ function such that $0 \le \varphi \le 1$, and

\[ \varphi(x) = 0 \quad\text{if } 0 \le \|x\| \le 1/2, \qquad
   \varphi(x) = 1 \quad\text{if } 1 \le \|x\|. \]

We use $\|\ \|$ for the sup norm in $\mathbf{R}^n$. The graph of $\varphi$ looks like this:

[Figure: graph of $\varphi$]


For each positive integer k, let $\varphi_k(x) = \varphi(kx)$. Then each partial derivative
$D_i\varphi_k$ satisfies the bound

\[ \|D_i\varphi_k\| \le k\,\|D_i\varphi\|, \]

which is thus bounded by a constant times k. Let L denote the lattice of
integral points in $\mathbf{R}^n$. For each $l \in L$, we consider the function

\[ x \mapsto \varphi_k\left(x - \frac{l}{2k}\right). \]

This function has the same shape as $\varphi_k$ but is translated to the point
$l/(2k)$. Consider the product

\[ g_k(x) = \prod \varphi_k\left(x - \frac{l}{2k}\right) \]

taken over all $l \in L$ such that $d(l/(2k), S) \le 1/k$. If x is a point of $\mathbf{R}^n$ such
that $d(x, S) < 1/(4k)$, then we pick an l such that

\[ d(x, l/(2k)) \le 1/(2k). \]

For this l we have $d(l/(2k), S) < 1/k$, so that this l occurs in the product,
and

\[ \varphi_k(x - l/(2k)) = 0. \]

Therefore $g_k$ is equal to 0 in an open neighborhood of S. If on the other
hand we have $d(x, S) > 2/k$ and if l occurs in the product, that is

\[ d(l/(2k), S) \le 1/k, \]

then

\[ d(x, l/(2k)) > 1/k \]

and hence $g_k(x) = 1$. The partial derivatives of $g_k$ are bounded in the
desired manner. This is easily seen, for if $x_0$ is a point where $g_k$ is not
identically 1 in a neighborhood of $x_0$, then $\|x_0 - l_0/(2k)\| \le 1/k$ for some
$l_0$. All other factors $\varphi_k(x - l/(2k))$ will be identically 1 near $x_0$ unless
$\|x_0 - l/(2k)\| \le 1/k$. But then $\|l - l_0\| \le 4$, whence the number of such l is
bounded as a function of n (in fact by $9^n$). Thus when we take the
derivative, we get a sum of at most $9^n$ terms, each one having a derivative
bounded by $C_1 k$ for some constant $C_1$. This proves our lemma.
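The construction in this proof can be tried out numerically. The Python sketch below is one possible realization under assumptions that are ours: S is a finite set of points in $\mathbf{R}^2$ (so that the lattice points entering the product can be listed), and the bump `phi` is built from a particular smooth step, which is one hypothetical choice among many functions with the required properties. The derivative bound is only probed by difference quotients on a grid.

```python
import itertools
import math

def smooth_step(t):
    """Smooth step (illustrative choice): 0 for t <= 1/2, 1 for t >= 1."""
    u = 2.0 * t - 1.0
    if u <= 0.0:
        return 0.0
    if u >= 1.0:
        return 1.0
    a = math.exp(-1.0 / u)
    b = math.exp(-1.0 / (1.0 - u))
    return a / (a + b)

def phi(x):
    """Bump with phi = 0 if sup-norm(x) <= 1/2 and phi = 1 if sup-norm(x) >= 1."""
    prod = 1.0
    for xi in x:
        prod *= 1.0 - smooth_step(abs(xi))
    return 1.0 - prod

def dist(x, S):
    """Sup-norm distance from the point x to the finite set S."""
    return min(max(abs(a - b) for a, b in zip(x, s)) for s in S)

def lattice_indices(S, k):
    """All integer vectors l with d(l/2k, S) <= 1/k, for a finite set S."""
    found = set()
    for s in S:
        axes = [range(math.floor(2 * k * c) - 2, math.ceil(2 * k * c) + 3)
                for c in s]
        for l in itertools.product(*axes):
            if dist([li / (2.0 * k) for li in l], S) <= 1.0 / k:
                found.add(l)
    return sorted(found)

def make_g(S, k):
    """Return g_k, the product of the translates phi_k(x - l/2k)."""
    indices = lattice_indices(S, k)

    def g(x):
        value = 1.0
        for l in indices:
            # phi_k(x - l/2k) = phi(k*x - l/2)
            value *= phi([k * xi - li / 2.0 for xi, li in zip(x, l)])
        return value

    return g

S = [(0.0, 0.0), (0.3, 0.7)]  # hypothetical two-point set in R^2
for k in (5, 10, 20):
    g = make_g(S, k)
    h = 1e-6
    worst = 0.0
    for i in range(-10, 21):
        for j in range(-10, 21):
            x = (i / 20.0, j / 20.0)
            quotient = abs(g((x[0] + h, x[1])) - g(x)) / h
            worst = max(worst, quotient)
    # The ratio worst / k is expected to stay bounded as k grows.
    print(k, worst / k)
```

The product has only finitely many factors here because S is finite. For a general compact S one would need some other way of listing the lattice points within distance $1/k$ of S.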


We return to the proof of Criterion 2. We observe that when an
$(n-1)$-form $\omega$ is expressed in terms of its coordinates,

\[ \omega(x) = \sum_{j=1}^{n} f_j(x)\, dx_1 \wedge \cdots \wedge \widehat{dx_j} \wedge \cdots \wedge dx_n, \]

then the coefficients $f_j$ are bounded on a compact neighborhood of S.
We take $U_k$ as in the lemma. Then for k large, each function

\[ x \mapsto f_j(x) D_j g_k(x) \]

is bounded on $U_k$ by a bound $C_2 k$, where $C_2$ depends on a bound for
$\omega$, and on the constant of the lemma. The Lebesgue measure of $U_k$ is
bounded by $C_3/k^2$, as we saw previously. Hence the measure of $U_k$
associated with $|dg_k \wedge \omega|$ is bounded by $C_4/k$, and tends to 0 as $k \to \infty$.
This proves our criterion.

As an example, we now state a simpler version of Stokes’ theorem, 
applying our criteria. 


Theorem 6.3. Let X be an open subset of $\mathbf{R}^n$. Let S be the set of
singular points in the closure of X, and assume that S is the finite union
of $C^1$ images of m-rectangles with $m \le n - 2$. Let $\omega$ be an $(n-1)$-form
defined on an open neighborhood of $\bar{X}$. Assume that $\omega$ has compact
support, and that the measures associated with $|\omega|$ on $\partial X$ and with $|d\omega|$
on X are finite. Then

\[ \int_X d\omega = \int_{\partial X} \omega. \]


Proof. Immediate from our two criteria and Theorem 6.1. 
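As a sanity check of Theorem 6.3 in the simplest case, one can take X to be the open unit square in $\mathbf{R}^2$. Its four corners are the singular frontier points (images of 0-dimensional rectangles, and here $n - 2 = 0$), and the boundary consists of the four open edges. The Python sketch below compares both sides numerically. The form $\omega = -y^2\,dx + xy\,dy$ is a hypothetical choice made for illustration, the edges are assumed to carry the usual counterclockwise orientation, and $\omega$ is used without a cutoff since only its values on the closed square enter the two integrals.

```python
def P(x, y):
    """Coefficient of dx in the illustrative form omega = P dx + Q dy."""
    return -y * y

def Q(x, y):
    """Coefficient of dy in the illustrative form omega = P dx + Q dy."""
    return x * y

def d_omega(x, y):
    """Coefficient of dx ^ dy in d(omega): dQ/dx - dP/dy = y + 2y."""
    return 3.0 * y

def integral_over_X(cells=200):
    """Midpoint rule for the integral of d(omega) over the open unit square."""
    h = 1.0 / cells
    total = 0.0
    for i in range(cells):
        x = (i + 0.5) * h
        for j in range(cells):
            y = (j + 0.5) * h
            total += d_omega(x, y) * h * h
    return total

def integral_over_boundary(cells=2000):
    """Midpoint rule for the integral of omega over the four open edges.

    The sample points are edge midpoints of subintervals, so the four
    corners (the singular points) are never evaluated.
    """
    h = 1.0 / cells
    total = 0.0
    for i in range(cells):
        t = (i + 0.5) * h
        total += P(t, 0.0) * h   # bottom edge, x from 0 to 1
        total += Q(1.0, t) * h   # right edge, y from 0 to 1
        total -= P(t, 1.0) * h   # top edge, x from 1 to 0
        total -= Q(0.0, t) * h   # left edge, y from 1 to 0
    return total

print(integral_over_X(), integral_over_boundary())
```

Both integrals can also be computed by hand for this form and equal 3/2, so the two printed numbers should agree up to rounding.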


We can apply Theorem 6.3 when, for instance, X is the interior of a
polyhedron, whose interior is open in $\mathbf{R}^n$. When we deal with a submanifold
X of dimension n, embedded in a higher dimensional space $\mathbf{R}^N$,
then one can reduce the analysis of the singular set to Criterion 2
provided that there exists a finite number of charts for X near this
singular set on which the given form $\omega$ is bounded. This would for
instance be the case with the surface of our cone mentioned at the
beginning of the section. Criterion 2 is also the natural one when dealing
with manifolds defined by algebraic inequalities. By using the resolution
of singularities due to Hironaka one can parametrize a compact set of
algebraic singularities as in Criterion 2.

Finally, we note that the condition that $\omega$ have compact support in an
open neighborhood of $\bar{X}$ is a very mild condition. If for instance X is a
bounded open subset of $\mathbf{R}^n$, then $\bar{X}$ is compact. If $\omega$ is any form on
some open set containing $\bar{X}$, then we can find another form $\eta$ which is
equal to $\omega$ on some open neighborhood of $\bar{X}$ and which has compact
support. The integrals of $\eta$ entering into Stokes’ formula will be the
same as those of $\omega$. To find $\eta$, we simply multiply $\omega$ with a suitable $C^\infty$
function which is 1 in a neighborhood of $\bar{X}$ and vanishes a little further
away. Thus Theorem 6.3 provides a reasonably useful version of Stokes’
theorem which can be applied easily to all the cases likely to arise
naturally.
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